Eukaryotic Cell, December 2005, p. 2098-2105, Vol. 4, No. 12
1535-9778/05/$08.00+0 doi:10.1128/EC.4.12.2098-2105.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Molecular, Microbial and Structural Biology, University of Connecticut Health Center, Farmington, Connecticut 06030-3305
Received 28 August 2005/ Accepted 11 October 2005
A number of genes in ciliated protozoa of the genus Euplotes (class Spirotrichea) have been identified that appear to require a +1 translational frameshift to produce their protein products. The putative +1-frameshift genes encode the regulatory subunit of cyclic AMP-dependent protein kinase and a nuclear protein kinase of Euplotes octocarinatus (39, 40), a La motif protein (p43) in Euplotes aediculatus (1), the Euplotes crassus Tec2 transposon ORF2 protein (11, 18), and the reverse transcriptase subunits of telomerase (TERT) in three euplotid species (20, 29, 42). Since the complete sequences of fewer than 100 Euplotes genes have been determined, it appears that frameshifting is unusually common in euplotids, with perhaps >5% of genes requiring a frameshift for expression (reviewed in reference 22).
Neither the mechanism nor the sequence requirements for the +1 frameshift in Euplotes genes are known. However, all of the genes have a lysine codon (AAA) followed by a termination codon (TAA, except for one instance in which it is TAG) at the end of the initial open reading frame (0 frame). This "Euplotes frameshift motif" (5'-AAA-TA(A/g)-3'; three-base groupings denote codons in the 0 frame) bears similarities to the "shifty stop" type of sequence elements required for +1 frameshifting in genes of other organisms (reviewed in references 6 and 36). A shifty stop typically contains a codon that would allow its cognate tRNA positioned in the ribosome P-site to undergo a +1 shift in reading frame and still maintain pairing with two bases in the mRNA; the AAA lysine codon of the Euplotes frameshift motif fulfills this criterion. The second feature is a poorly recognized termination tetranucleotide (i.e., the termination codon plus the following nucleotide), which is thought to slow or stall the ribosome, allowing an opportunity for a shift in reading frame. Here the Euplotes frameshift motif does not appear to conform to a shifty stop site, as TAA-A is most frequently found at frameshift sites, and this is the most frequent tetranucleotide at true sites of translation termination (22). It is possible that another undefined feature of the Euplotes frameshift mRNAs is responsible for slowing translation. Alternatively, it has been suggested that a second unusual genetic feature of Euplotes, stop-codon reassignment, is involved in slowing translation and promoting the frameshifts (22). Euplotids have reassigned the UGA stop codon of the universal code so that it now encodes cysteine (16, 28). This has occurred, in part, as a result of changes to the single eukaryotic translation release factor 1 protein (eRF1) so that it no longer recognizes the UGA stop codon (9, 21, 34a). 
It is possible that these alterations to Euplotes eRF1 have also impaired its ability to recognize the remaining UAA and UAG stop codons. If so, translation termination would be a generally slow process in Euplotes, and encountering the stop codon within the frameshift motif would provide a pause that facilitates the +1 frameshift.
Whatever the mechanism, the apparent high frequency of euplotid genes requiring frameshifts is unprecedented. While the current data suggest that >5% of euplotid genes require one or more frameshifts for expression, there are a number of reasons to view this number with suspicion. First, it is based on a relatively small sample of 67 genes (22). Second, the gene sequences derive from seven different Euplotes species and, in some cases, orthologous genes were included from the different species. Third, the gene sample was not random, and in all likelihood was biased towards highly expressed genes, as many of the genes encode tubulins, histones, and proteins involved in translation. To more accurately assess the frequency of frameshift-requiring genes in a single Euplotes species, 25 randomly selected Euplotes crassus macronuclear chromosomes, which typically contain single genes, were completely sequenced. Three novel genes requiring +1 translational frameshifts have been identified, all of which are shown to be expressed in vegetatively growing cells. The results support a high frequency of +1 translational frameshifting in euplotids and, indeed, suggest that the frequency of such genes may exceed 10%. The functions of the encoded frameshift proteins are also discussed in regard to the possible role of frameshifting in the coordinate regulation of gene expression.
Cloning of macronuclear DNA molecules. To construct small recombinant libraries of macronuclear DNA molecules, the single-stranded regions of telomeres were removed by treatment of 6 µg of E. crassus strain X1 DNA with 10 units of T4 DNA polymerase for 15 min at 37°C in the presence of all four deoxynucleotide triphosphates at a final concentration of 200 µM each. The DNA was then ligated into either the SmaI or HincII site of the pBluescript SK(+) phagemid (Stratagene, La Jolla, CA), transformed into Escherichia coli TOP10 chemically competent cells (Invitrogen, Carlsbad, CA), and the cells were spread on plates containing 50 µg/ml ampicillin and 40 µl of 20 µg/ml X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside). White colonies were randomly selected and expanded, and DNA was prepared using either the Wizard Plus Miniprep kit (Promega, Madison, WI) or the QIAprep Spin Miniprep kit (QIAGEN, Valencia, CA). Sizes of macronuclear DNA inserts were determined by digestion with either EcoRI + BamHI or KpnI + BamHI restriction enzymes and electrophoresis on 0.8% agarose gels prepared and run in 1x TBE (89 mM Tris, 89 mM H3BO3, and 2 mM disodium EDTA; pH 8.3).
DNA sequencing. All DNA sequencing was performed by the University of Connecticut Health Center Molecular Core facility using the Taq Dyedeoxy Termination cycle sequencing kit (Perkin Elmer Cetus, Norwalk, CT). For sequencing of the cloned macronuclear DNA molecules, the initial reactions employed the T3 (5'-ATTAACCCTCACTAAAGGGA-3') and T7 (5'-TAATACGACTCACTATAGGG-3') sequencing primers. As necessary, additional sequencing reactions were performed to extend the sequences, using oligonucleotide primers that were designed based on the initial sequence reads, until the complete sequences of the macronuclear DNA molecules were obtained. Five clones (pEC4, pEC5, pEC8, pEC9, and pEC10) were lost prior to the completion of sequencing. In these instances, the missing segments of DNA were obtained by PCR from total cellular DNA, and the PCR products were directly sequenced to complete the sequences of the macronuclear DNA molecules. All primers for sequencing and PCR were purchased from Invitrogen, and their sequences are available on request. The sequences of the macronuclear clones have been deposited in GenBank under accession numbers DQ114948 to DQ114975.
PCR and reverse transcription-PCR. PCR was carried out using 100 ng of E. crassus strain X1 genomic DNA as the substrate and KlenTaq DNA polymerase (Sigma, St. Louis, MO) under conditions specified by the manufacturer. Twenty-five cycles of PCR were carried out, with a cycle consisting of a 95°C denaturation step for 1 min, a 1-min annealing step, and a 72°C elongation step for 30 seconds to 1 min, depending on the length of the expected product. The temperature for the annealing step was adjusted based on the G+C content of the primers. For sequencing, PCRs were typically run on a low-melting-point agarose (Invitrogen) gel, and the PCR product was excised and purified as described by Qian and Wilkinson (33).
Reverse transcription (RT)-PCR was performed using the SuperScript One-Step RT-PCR with Platinum Taq kit (Invitrogen). The reactions were performed according to the manufacturer's protocol, using 200 ng of E. crassus strain CT5 (mating type III) total RNA as the substrate and 30 cycles of PCR following the reverse transcription step. To assess any possible DNA contamination of the RNA preparation, control reactions lacking the reverse transcription step were performed by adding the substrate RNA to reactions after the 94°C step that inactivates the reverse transcriptase enzyme, but prior to PCR amplification.
The following pairs of oligonucleotides were used for genomic PCR and RT-PCR analyses of the pEC2, pEC14, and pEC26 putative frameshift genes, respectively: pEC2F (5'-AGGAGGCATTCCCACTTTTG-3') and pEC2R (5'-TGATGAAGCAGAAGCTGGTG-3'), pEC14F (5'-ACTCATCCATGCAGACGGTG-3') and pEC14R (5'-TTTTTCCAAATTCCCTCTCG-3'), and R3EC26 (5'-TATCCCTGGGAATGCACAAA-3') and 3EC26 (5'-TGGTAGTCCTGTTCCTTTCC-3').
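The annealing temperature was adjusted according to primer G+C content, but the text does not say which rule was applied. As a rough illustration only, the sketch below uses the Wallace rule of thumb (2°C per A or T, 4°C per G or C), which is an assumption and not a method stated in this article, to estimate melting temperatures for the primers listed above. The function names are hypothetical.

```python
# Hypothetical helper functions; the Wallace rule is an assumed stand-in,
# not the rule the author reports using.
PRIMERS = {
    "pEC2F": "AGGAGGCATTCCCACTTTTG",
    "pEC2R": "TGATGAAGCAGAAGCTGGTG",
    "pEC14F": "ACTCATCCATGCAGACGGTG",
    "pEC14R": "TTTTTCCAAATTCCCTCTCG",
    "R3EC26": "TATCCCTGGGAATGCACAAA",
    "3EC26": "TGGTAGTCCTGTTCCTTTCC",
}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: GC = {gc_fraction(seq):.2f}, Tm ~ {wallace_tm(seq)} C")
```

An annealing temperature a few degrees below the lower Tm of a primer pair is a common starting point, though the actual temperatures used in this study are not given.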
Bioinformatic and statistical analyses. A confidence interval for the frequency of E. crassus frameshift genes was calculated using the Blyth-Still-Casella method (7, 8) and StatXact-4 for Windows software (Cytel Software Corp., Cambridge, MA).
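The Blyth-Still-Casella interval was computed with commercial software. For readers without access to it, the sketch below computes a different exact binomial interval, the Clopper-Pearson interval, using only the Python standard library. This is a substituted technique, so it should not be expected to reproduce the published interval exactly; Clopper-Pearson limits are typically somewhat more conservative. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, iterations: int = 100) -> float:
    """Bisection on [0, 1] for a function g that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact (Clopper-Pearson) two-sided confidence interval for x successes in n trials."""
    if x == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= x | p) = alpha / 2
        lower = _root_of_increasing(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    if x == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= x | p) = alpha / 2
        upper = _root_of_increasing(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


if __name__ == "__main__":
    low, high = clopper_pearson(3, 23)  # 3 frameshift genes among 23 protein-coding chromosomes
    print(f"95% CI: {100 * low:.1f}% to {100 * high:.1f}%")
```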
Sequences of the macronuclear DNA molecules, with telomeric repeats removed, were used in BLASTx and BLASTn analyses (see reference 27) of the nonredundant protein and nucleotide GenBank databases at the NCBI website (http://www.ncbi.nlm.nih.gov/blast/). Default parameters were employed, except that the euplotid nuclear genetic code was employed in BLASTx searches. For genes that failed to produce strong matches, selected long open reading frames were also used in BLASTp searches of the GenBank nonredundant protein database, BLASTp searches of the Tetrahymena thermophila preliminary gene predictions generated by The Institute for Genome Research (August 2004; http://tigrblast.tigr.org/er-blast/index.cgi?project=ttg), and tBLASTn searches of the Paramecium tetraurelia macronuclear genome sequences (http://paramecium.cgm.cnrs-gif.fr/blast/) and the Tetrahymena thermophila macronuclear genome (Assembly 2, November 2003; http://tigrblast.tigr.org/er-blast/index.cgi?project=ttg). In all BLAST analyses, only matches with expect (E) values of <10⁻⁶ were considered significant. In cases where a frameshift was suspected, predicted proteins were generated assuming that the +1 frameshift occurs following the incorporation of the lysine residue (AAA codon) of the 5'-AAA-TA(A/g)-3' frameshift motif. The macronuclear sequences were also searched for tRNA genes using tRNAscan-SE 1.21 (26) at http://lowelab.ucsc.edu/tRNAscan-SE/, with a score of >40 considered significant, and some putative open reading frames (ORFs) were analyzed for conserved PROSITE domains and motifs (http://au.expasy.org/prosite/).
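The rule used to generate predicted frameshift proteins can be sketched in a few lines. The example below is an illustration under stated assumptions, not the author's software: it translates with the euplotid nuclear code (UGA read as cysteine), and whenever an in-frame AAA is immediately followed by TAA or TAG it adds the lysine and then skips one nucleotide to continue in the +1 frame. Applying the shift at every such motif is an assumption, and the function name and test sequence are invented for illustration.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; the euplotid nuclear code differs only at TGA.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
EUPLOTID_CODE = dict(zip(CODONS, STANDARD_AAS))
EUPLOTID_CODE["TGA"] = "C"  # UGA is reassigned to cysteine in euplotids


def translate_with_frameshift(dna: str, start: int = 0):
    """Translate from `start`, shifting +1 after the AAA of each AAA-TA(A/G) motif.

    Returns the protein string and the 0-based positions of the AAA codons
    at which a +1 shift was applied.
    """
    dna = dna.upper()
    protein = []
    shift_sites = []
    pos = start
    while pos + 3 <= len(dna):
        codon = dna[pos:pos + 3]
        next_codon = dna[pos + 3:pos + 6]
        if codon == "AAA" and next_codon in ("TAA", "TAG"):
            protein.append("K")      # lysine is incorporated first
            shift_sites.append(pos)
            pos += 4                 # skip one extra nucleotide: +1 frameshift
            continue
        amino_acid = EUPLOTID_CODE.get(codon, "X")  # X for ambiguous bases
        if amino_acid == "*":
            break                    # ordinary termination
        protein.append(amino_acid)
        pos += 3
    return "".join(protein), shift_sites


if __name__ == "__main__":
    # Invented toy sequence containing one AAA-TAA motif.
    print(translate_with_frameshift("ATGGCTAAATAAGCTTGATAA"))
```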
WebLogo (10, 35) (http://weblogo.berkeley.edu/logo.cgi) was employed to search for conserved sequences at defined distances from frameshift sites. To search for conserved sequence elements present at variable distances from frameshift sites, ClustalW (41) was used to align the frameshift sequences using both default parameters and reduced gap creation/extension penalties. Multiple Em for Motif Elicitation, version 3 (MEME; http://meme.sdsc.edu/meme/website/meme.html) (5) was also used to search for conserved elements using a variety of parameters.
TABLE 1. Characteristics of sequenced macronuclear DNA molecules
Functions of the macronuclear DNA molecules. To determine the possible functions of any genes within the macronuclear molecules, BLASTn and BLASTx searches (3) of the GenBank nonredundant nucleotide and protein databases, respectively, were carried out. In addition, the program tRNAscan-SE (26) was used to search for tRNA genes. In cases where these searches failed to identify possible gene functions, tBLASTn searches of the DNA sequence databases of the ciliates Tetrahymena thermophila and Paramecium tetraurelia were conducted in an attempt to determine whether ciliate homologs of the genes existed, and a conceptual translation of the longest open reading frame of the macronuclear insert was used in a BLASTp search of the protein database in a further attempt to identify any possible related proteins or sequence motifs that would be indicative of function.
For the BLAST searches, only database matches with expect (E) values of <1 × 10⁻⁶ were considered significant. Based on this criterion, 11 of the 25 macronuclear DNA molecules encode proteins of known function or have homologs in other nonciliate organisms (Table 1). In addition, the pEC1 macronuclear DNA molecule was found to encode a U2 small nuclear RNA (snRNA), and pEC12 is predicted to encode an isoleucine tRNA with an AAU anticodon. Of the 11 remaining macronuclear DNA molecules, two (pEC2 and pEC5) gave significant hits in the tBLASTn searches of the Tetrahymena and Paramecium genomic sequences (Table 1), indicating that these macronuclear DNA molecules likely encode proteins that are at least conserved among ciliates. Overall, evidence for a possible gene function, or at least evidence for the presence of a functional gene, was obtained for 15 of the 25 cloned macronuclear DNA molecules.
Identification of candidate frameshift genes. The above bioinformatic analyses also provided indications that some of the newly sequenced macronuclear genes require a translational frameshift for expression. Such genes are expected to generate two separate hits to different regions of the same protein in BLASTx searches, with the two matching regions of the macronuclear DNA molecule encoding polypeptides in different reading frames. Such results can also be due to the presence of introns, which are rare in spirotrich genes (17), but an intron is also expected to result in a gap in the alignment of the conceptually translated DNA sequence with the protein homolog, while genes requiring a frameshift should not display such an alignment gap. In addition, based on the typical arrangement of +1 frameshift genes in Euplotes, the initial open reading frame (reading frame 0) is expected to terminate with the sequence 5'-AAA-TA(A/g)-3' (22).
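The screening logic described above can be expressed as a small decision rule over pairs of BLASTx hits. The sketch below is hypothetical: the `Hsp` record, the function name, the 10-nucleotide gap tolerance, and the example coordinates are all assumptions made for illustration, and only forward-strand frames (1 to 3) are handled. A real screen would additionally check that the upstream ORF ends with 5'-AAA-TA(A/g)-3'.

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    """One BLASTx hit (hypothetical record; coordinates are 1-based)."""
    frame: int    # query reading frame on the forward strand: 1, 2, or 3
    q_start: int  # nucleotide coordinates on the macronuclear DNA query
    q_end: int
    s_start: int  # amino acid coordinates on the subject protein
    s_end: int


def classify_pair(up: Hsp, down: Hsp, max_gap_nt: int = 10) -> str:
    """Classify two hits to the same protein, `up` lying 5' of `down` on the query."""
    if up.frame == down.frame:
        return "same frame"
    if down.s_start <= up.s_start:
        return "not collinear"
    gap_nt = down.q_start - up.q_end - 1
    if gap_nt > max_gap_nt:
        # Unaligned DNA between the two matching regions suggests an intron.
        return "possible intron"
    if (down.frame - up.frame) % 3 == 1:
        return "+1 frameshift candidate"
    return "other frame change"


if __name__ == "__main__":
    upstream = Hsp(frame=1, q_start=100, q_end=399, s_start=10, s_end=109)
    adjacent = Hsp(frame=2, q_start=401, q_end=700, s_start=110, s_end=209)
    print(classify_pair(upstream, adjacent))
```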
Based on these criteria, three of the E. crassus macronuclear genes are strong candidates for requiring +1 frameshifts for expression. The pEC14 macronuclear DNA molecule generated strong matches in BLASTx searches to proteins containing MORN repeats (Table 1). The MORN repeat is a 14-amino-acid (aa) motif that is found in multiple copies in a number of functionally distinct proteins (for examples, see references 19, 24, and 38). In the case of the protein junctophilin, the MORN repeats have been implicated in this protein's association with the plasma membrane, giving rise to the MORN acronym ("membrane occupation and recognition nexus"). In the pEC14 macronuclear DNA molecule, an initial open reading frame (0 frame ORF) encodes six complete MORN repeats and terminates with part of a seventh, while the second +1 open reading frame (+1 frame ORF) contains the remainder of the seventh MORN repeat plus two additional complete repeats (Fig. 1a). The 0 frame ORF terminates with the frameshift motif sequence 5'-AAA-TAA-3', making it likely that a +1 translational frameshift joins the 0 and +1 reading frames to generate a protein of 354 aa that contains a total of nine MORN repeats.
FIG. 1. Maps of macronuclear chromosomes containing putative frameshift genes and organization of their coding regions. Horizontal black bars denote the pEC14 (a), pEC2 (b), and pEC26 (c) macronuclear DNA molecules, with the positions of selected initiation codons (ATG), termination codons (TAA or TAG), and frameshift motifs indicated. Rectangles typically denote the 0 and +1 ORFs that can be joined by a +1 frameshift to produce a single protein. Lines terminating with black balls indicate the segments amplified by PCR of genomic DNA and RT-PCR of mRNA to confirm the presence of putative frameshift sites. In the case of pEC26 (c), not all termination codons are shown, for simplicity, and the position of the internal block of telomeric repeats is indicated (G4T4). In addition, the three ORFs (0, +1, and +2 frame) encoding the 12 conserved protein kinase domains (I-XI) are indicated, and two long ORFs that precede the protein kinase coding region are indicated by rectangles with question marks.
Macronuclear clone pEC26 is also a strong candidate for a +1 translational frameshift, albeit a more complex one (Fig. 1c). In this case, BLASTx searches identified three overlapping ORFs (183 bp, 441 bp, and 1,077 bp), each shifted +1 relative to the upstream ORF, that conceptually encode parts of a serine/threonine protein kinase (Table 1; the top database hits were to members of the SNF-1-like subfamily). Twelve conserved domains have been identified in protein kinases (15). For pEC26, the 0 frame ORF would encode domains I, II, and part of III, the contiguous +1 ORF would encode the remainder of domain III through part of domain VII, and the +2 ORF would encode the remainder of domain VII through domain XI (Fig. 1c). Therefore, two +1 shifts in reading frame would be required to produce a complete protein kinase domain. It should also be noted that the protein kinase region occupies only a small portion of this 5.39-kbp macronuclear DNA molecule, and it is preceded by two large ORFs of 1,095 bp and 1,293 bp (Fig. 1c). While conceptual translations of these upstream ORFs failed to identify homologs in database searches, the second of the upstream ORFs terminates with a 5'-AAA-TAA-3' frameshift motif, and it is oriented such that a +1 frameshift would translationally link it to the protein kinase region. Thus, it is possible that this gene requires three +1 frameshifts for expression.
One final unusual feature of the pEC26 clone is that it contains a 28-bp block of the Euplotes telomeric repeat sequence 5'-GGGGTTTT-3' (G4T4 repeat) beginning 1,167 bp from the left end of the cloned insert (Fig. 1c). Internal blocks of telomeric repeats of this length, which happens to correspond to the length of the double-stranded region of Euplotes macronuclear telomeres (23), have not previously been seen in internal regions of spirotrich macronuclear DNA molecules. This suggested that the pEC26 clone insert might represent a composite clone derived from all or part of two macronuclear DNA molecules artificially joined during the cloning process. A series of PCR analyses using total genomic DNA as a substrate supported this hypothesis (data not shown). Six combinations of oligonucleotide primers whose binding sites were all located to the right of the internal G4T4 block (as oriented in Fig. 1c) generated PCR products of the expected sizes, consistent with this region constituting a single macronuclear chromosome. In contrast, three combinations of primers whose binding sites bracketed the G4T4 block failed to produce the PCR products predicted from the pEC26 clone. While it is possible that the internal telomeric repeat block or some other feature of this region of pEC26 interferes with successful amplification, the results are consistent with the notion that the pEC26 clone is an artifact in the sense that the regions to the left and right of the G4T4 block are derived from all or parts of two different macronuclear DNA molecules.
To exclude the possibility that cloning or sequencing errors resulted in the appearance of the frameshift sites in the three macronuclear genes, PCRs were carried out on genomic DNA using primers that flanked putative frameshift sites (Fig. 1), and the resulting PCR products were directly sequenced. The sequence of the pEC14 genomic PCR product completely matched that of the clone, confirming the presence of the frameshift site. The genomic PCR sequences for pEC2 and pEC26 essentially matched those of their respective clones, with the exception that four polymorphic positions were seen in the pEC2 genomic PCR product, and three polymorphic positions were found in the pEC26 genomic sequence. In each case, one of the two alternative bases at each polymorphic position matched the sequence of the clone, and the alternative base represented a synonymous change in the coding region. These polymorphisms likely represent allelic variation, as the E. crassus X1 strain is not inbred. Overall, the results indicate that the frameshift sites are indeed present in the three genes, including two alternative forms of the gene in the cases of pEC2 and pEC26.
Expression of the frameshift genes. RT-PCR analyses were carried out to determine if the three newly identified frameshift genes are expressed in vegetative cells and to confirm that the frameshift sites were present in the mRNAs. These analyses utilized total RNA isolated from E. crassus strain CT5 as the substrate, as strain X1 was no longer viable at the time of the analysis, and the same primers used in the previous analysis of genomic DNA (Fig. 1). For each of the genes, PCR products of the expected size were obtained, and these were dependent on the inclusion of the reverse transcription step, indicating that they are not the result of contaminating DNA (Fig. 2). Two additional smaller RT-PCR products were observed in the pEC14 RT-PCR analysis (Fig. 2), but these proved to be nonspecific, as they were shown to be generated by only one of the two pEC14 oligonucleotide primers (data not shown).
FIG. 2. Agarose gel displaying RT-PCR products from the pEC2, pEC14, and pEC26 genes. Reactions were carried out both in the presence (+) and absence (-) of a reverse transcription step. Sizes of selected marker DNA fragments are shown to the left in kilobase pairs, and the sizes of RT-PCR products are indicated in kilobase pairs to the right of the gels. Note that for pEC14, the two bands smaller than 0.53 kbp were shown to be nonspecific PCR products.
Conserved sequences associated with frameshift sites. With the expanded sample of Euplotes frameshift genes, a number of analyses were carried out to look for conserved sequence elements that might facilitate frameshifting. For these analyses, the 50 bp upstream and downstream of the conserved 5'-AAA-TA(A/g)-3' motif from the 12 known frameshift sites were considered (only single examples of a frameshift site were considered in cases where more than one homolog with a frameshift site has been identified). To search for conserved sequence elements that might exist at a defined distance from frameshift sites, the sequences were aligned at the 5'-AAA-TA(A/g)-3' frameshift motif, and individual positions in the aligned sequences were evaluated for information content/sequence conservation using WebLogo (10, 35). The only well-conserved sequence element identified was the 5'-AAA-TA(A/g)-3' frameshift motif itself (Fig. 3). It was previously reported that there was an additional conserved A residue following this motif (22, 40), but that conclusion was based on a small sample of frameshift sites, and many of the sites subsequently identified do not have an A residue at this position. Conserved sequence elements might also exist at a variable distance from the site of the frameshift. Tan et al. (40) noted that the hexanucleotide 5'-CAAGAA-3' was often found upstream of the six then-known frameshift sites. However, exact matches to this hexanucleotide are present in only 4 of the 12 currently known frameshift sites, making it unlikely that this sequence element is important for frameshifting. Additional searches for conserved sequence elements at variable distances from the frameshift motif (see Materials and Methods) failed to identify any highly conserved sequence elements that were shared by all of the frameshift sites.
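The per-position conservation measure that a sequence logo displays can be approximated as 2 bits minus the Shannon entropy of the base frequencies in each alignment column. The sketch below shows that calculation on a made-up toy alignment. It is a simplification that omits the small-sample correction applied by logo software, and the function name and sequences are illustrative assumptions rather than the data analyzed here.

```python
from collections import Counter
from math import log2


def information_content(aligned):
    """Per-column information content (bits) for equal-length DNA sequences.

    Uses 2 - H, where H is the Shannon entropy of the observed base
    frequencies; no small-sample correction is applied.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    bits = []
    for i in range(length):
        counts = Counter(seq[i].upper() for seq in aligned)
        total = sum(counts.values())
        entropy = -sum((c / total) * log2(c / total) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits


if __name__ == "__main__":
    # Toy alignment anchored on the frameshift motif (invented sequences).
    toy_sites = ["AAATAA", "AAATAA", "AAATAG", "AAATAA"]
    for position, value in enumerate(information_content(toy_sites), start=1):
        print(position, round(value, 3))
```

A fully conserved column scores 2 bits, while a column with all four bases at equal frequency scores 0.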
FIG. 3. WebLogo displaying sequence conservation in the vicinities of frameshift sites. Sizes of letters denote information content, or sequence conservation, at each position. The analysis is based on the alignment of the 50 bp preceding and following the 5'-AAA-TA(A/g)-3' frameshift motif from the following frameshift sites/genes: the single frameshifts of pEC2 (GenBank accession no. DQ114952), pEC14 (DQ114962), and the three frameshift sites of pEC26 (DQ114969) identified in this study; E. octocarinatus cyclic AMP-dependent protein kinase (AJ238280) (39); E. aediculatus p43/La motif protein (AF307939) (1); E. octocarinatus npk2/Eondr2 (AJ249684) (40); E. crassus orf2 of transposon Tec2-1 (L03360) (18); frameshift sites 1 and 2 of E. crassus TERT-1 (AF528527) (42); and frameshift site 3 of Euplotes minuta TERT (AY303934) (29).
Frequency of frameshift genes in E. crassus. Considering only the macronuclear chromosomes that do not encode untranslated RNA products, 3 of 23 (13%) were found to require frameshifts. This value is likely an underestimate of the percentage of genes requiring a frameshift, as the strategy for defining a frameshift site depended on the identification of a homologous gene in another organism. That is, some of the macronuclear chromosomes whose functions are unknown may also require a frameshift for expression, and, indeed, there are cases in this subpopulation where ORFs of unknown function could be joined to each other by a +1 frameshift at a 5'-AAA-TAA-3' sequence.
The observed 13% frequency of frameshifting somewhat exceeds the value of ~7.5% (5 of 67 genes) obtained from a previous survey of genes available in GenBank (22) and provides support for the notion that euplotids possess an extremely high number of genes requiring +1 frameshifts for expression. While there is still considerable uncertainty as to the true percentage of frameshift genes, as a result of the small sample size, the current data provide a 95% confidence interval of 3.7 to 31.7% for the percentage of frameshift genes. Even the 3.7% value at the lower end of this range is >100-fold higher than the reported frequency of frameshift genes in other organisms, such as yeast, where only 2 nontransposon genes (4, 31) of the ~6,000 total protein-coding genes in the genome (~0.03%) have been reported to require frameshifts for expression.
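The percentages and the fold difference quoted in this section follow directly from the counts given in the text. A minimal check of that arithmetic, using a hypothetical helper name and a round figure of 6,000 yeast genes:

```python
def percent(count: int, total: int) -> float:
    """Express count/total as a percentage."""
    return 100.0 * count / total


this_study = percent(3, 23)        # frameshift genes among protein-coding chromosomes
genbank_survey = percent(5, 67)    # earlier survey of GenBank entries
yeast = percent(2, 6000)           # nontransposon frameshift genes in yeast
fold_over_yeast = 3.7 / yeast      # lower confidence limit relative to yeast

if __name__ == "__main__":
    print(f"this study:     {this_study:.1f}%")
    print(f"GenBank survey: {genbank_survey:.1f}%")
    print(f"yeast:          {yeast:.3f}%")
    print(f"lower limit is about {fold_over_yeast:.0f}-fold above yeast")
```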
Efficiency of frameshifting. A previous evolutionary analysis of the TERT genes in a number of Euplotes species (29) indicated that frameshift sites have arisen during the diversification of euplotids. Coupled with the observed high frequency of frameshift genes, this led to the suggestion that euplotids may possess an efficient mechanism of +1 frameshifting, such that mutations resulting in appropriately oriented reading frames joined by the 5'-AAA-TA(A/g)-3' frameshift motif would be selectively neutral. In genes from other organisms that require frameshifts, the frequency at which the ribosome shifts reading frames varies considerably, but can be as high as 80% (reviewed in references 12 and 13). While information of this type is not available for Euplotes, the current results and past studies suggest that there are some constraints on the types of genes containing frameshift sites, and, thus, that not all ribosomes undergo a frameshift. Specifically, frameshift sites occur predominantly in genes encoding proteins with enzymatic functions, as opposed to genes encoding abundant proteins in the cell. Eight different types of Euplotes genes have been identified to date with frameshift sites. Five encode enzymes (three protein kinases, TERT, and the Tec2 tyrosine recombinase), and the p43 La motif protein is associated with the RNA component of the telomerase enzyme (2) and appears to anchor it in the nucleus (30). The functions of the remaining two proteins (the pEC14/MORN repeat protein and the pEC2 protein) are unknown, but there is no reason to suspect that they might be abundant in the cell. In contrast, the complete coding sequences for 27 genes encoding tubulins, histones, and ribosomal proteins in seven different Euplotes species are currently listed in GenBank (as of July 2005), and none have been reported to require a frameshift for expression. 
Tubulins, histones, and ribosomal proteins are almost certainly among the most abundant proteins in the cell, and the absence of any genes requiring a frameshift among this reasonably sized sample suggests that frameshift sites are not tolerated within highly expressed proteins. The apparent avoidance of frameshift sites in genes encoding abundant proteins suggests that frameshifting may reduce the level of translated protein, so that there may be selection against alleles with frameshift sites for either genes encoding abundant proteins or for genes whose protein products are at or near critical levels in the cell.
Is Euplotes +1 translational frameshifting involved in regulating gene expression? Programmed translational frameshifting is known or thought to play a role in regulating the expression of a number of genes in other organisms (reviewed in references 13 and 32), and a number of reports have proposed that it also may be involved in regulating gene expression in euplotids (for examples, see references 1, 11, and 20). At the present time, there is no direct experimental evidence for frameshifting playing a role in euplotid gene regulation. However, if the 5'-AAA-TA(A/g)-3' frameshift motif is the only sequence element required for a +1 frameshift, a regulatory function for frameshifting would appear unlikely, as it would presumably influence the expression of a significant fraction of the genes in the genome. It is possible that different classes of accessory regulatory elements exist, with particular elements shared by subsets of genes involved in a common cellular process, which would enable their coordinate regulation. There are some indications of genes with related functions being overrepresented among the currently small number of known euplotid frameshift genes, but the significance in each case is still unclear. First, three of the known frameshift genes encode putative protein kinases, but there is as yet no evidence that these three enzymes are involved in the same cellular process or pathway. Second, two of the known frameshift genes, TERT (20, 29, 42) and the p43 La motif protein gene (1), are involved in telomere-related functions. However, telomeres have been intensely studied in euplotids, so the identification of these two genes may simply be a matter of representation of telomere-related genes in the overall small sample from this organism. 
Thus, it is still difficult to differentiate between +1 frameshifting serving a regulatory function in euplotids, as opposed to these organisms having evolved a relatively efficient frameshift mechanism that tolerates the existence of frameshift sites within genes. More detailed studies of the expression of individual genes under different conditions, as well as expansion of the list of genes requiring frameshifts for expression, will be needed to resolve this issue.
I thank Stephen Walsh for his help with statistical analysis and Donna Cortezzo and Sara Avatapalli for technical assistance.