Eukaryotic Cell, December 2007, p. 2269-2277, Vol. 6, No. 12
1535-9778/07/$08.00+0 doi:10.1128/EC.00044-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Department of Biological Sciences,[1] Center for Gene Structure and Function,[2] Department of Computer Science, Hunter College, City University of New York, New York, New York 10021,[3] Department of Biology, Brooklyn College of City University of New York, Brooklyn, New York 11210[4]
Received 9 February 2007/ Accepted 3 October 2007
Query-specific matrices allow identification and alignment of homologous sequences among fungal cell wall proteins (9). We have compared S. cerevisiae cell wall proteins to open reading frames (ORFs) in 17 other complete genomes and to the NR database (36). This large collection of ORFs supports the comparison of cell wall-related sequences in diverse fungal species. Our results show that fungal walls are both conserved in origin and diversified in composition.
These 103 proteins were then used as queries in searches against the S. cerevisiae genome sequence database (SGD). To avoid corruption of the results by low-complexity sequences, all searches reported here used BLAST with the query-specific gtQ scoring matrices (9). A gtQ matrix compensates for overrepresentation of any residue while preserving the negative value of the entire matrix. Thus, it is much more discriminating without sacrificing sensitivity. A transitive closure procedure was conducted with multiple rounds of searches against the S. cerevisiae genome, until no new homologous ORFs were identified. In a given round, new ORFs that participated in high-scoring pairs with e-values of <10^-5 became the query set for the next round, still against the same database. This process continued until no additional proteins with e-values below the specified cutoff were obtained (22, 62). This method identified a total of 171 proteins, including the original 103 queries, as S. cerevisiae cell wall components or their paralogs. To these genes we added 16 other sequences that are not annotated as cell wall in GO in SGD but are known to be associated with wall synthesis and biogenesis (28, 29, 35). These 187 ORFs are listed in Table S1 of the supplemental material and are referred to in the remainder of this paper as S. cerevisiae cell wall-related proteins.
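The transitive closure procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `search` callable is a hypothetical stand-in for one BLAST-gtQ run that returns (hit, e-value) pairs, and the ORF names and e-values in the toy table are invented.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical search interface: one query ID in, (hit ID, e-value) pairs out.
# In practice this would wrap a BLAST run with a query-specific matrix.
SearchFn = Callable[[str], Iterable[Tuple[str, float]]]


def transitive_closure(seeds: Iterable[str], search: SearchFn,
                       evalue_cutoff: float = 1e-5) -> Set[str]:
    """Repeat searches until no new ORF passes the e-value cutoff."""
    found = set(seeds)
    queries = set(found)
    while queries:
        new_hits: Set[str] = set()
        for query in queries:
            for hit, evalue in search(query):
                if evalue < evalue_cutoff and hit not in found:
                    new_hits.add(hit)
        found |= new_hits
        queries = new_hits  # only the newly found ORFs are searched next round
    return found


# Toy hit table with made-up ORF names and e-values (illustration only).
toy_hits: Dict[str, List[Tuple[str, float]]] = {
    "A": [("B", 1e-20), ("X", 0.5)],
    "B": [("A", 1e-20), ("C", 1e-8)],
    "C": [("B", 1e-8)],
    "X": [("Y", 1e-30)],
}
family = transitive_closure(["A"], lambda q: toy_hits.get(q, []))
print(sorted(family))
```

In the toy table, C is reached only through B, which is the point of the closure: a paralog need not match any original query directly, only some member of the growing set.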
Cell wall-related proteins from Yarrowia lipolytica. We needed an independent set of queries that could be used to test the relationship between homolog occurrence and phylogenetic distance (see Fig. 4, below). Therefore, we identified a set of putative cell wall-related proteins from the ascomycete Yarrowia lipolytica as query sequences. Because annotation is sparse in species other than S. cerevisiae, we used the ORFs predicted to have GPI anchors, which are common in cell wall proteins in S. cerevisiae, C. albicans, and other fungi (61). Y. lipolytica putative cell wall-related sequences were identified as follows. We used the GPI-SOM server to find 237 proteins with predicted GPI signals and secretion signals (19). Of these 237 proteins, the TMHMM prediction server (http://www.cbs.dtu.dk/services/TMHMM/) identified 188 proteins without transmembrane domains. The Fungal BIG-PI server (http://mendel.imp.ac.at/gpi/fungi_server.html) was slightly more conservative and identified a subset of 149 (79%) as potential GPI-anchored proteins (18). Like the S. cerevisiae cell wall proteins, 115 of the Y. lipolytica proteins contained regions with high S and T content: >50 consecutive residues with >30% S and T.
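The S/T-content criterion can be illustrated with a short sliding-window check. This is a sketch under an assumption: it reads ">50 consecutive residues with >30% S and T" as any fixed window of 51 residues in which Ser plus Thr exceed 30%. The function name, the window interpretation, and the test sequences are all hypothetical and may differ from the procedure actually used.

```python
def has_st_rich_region(seq: str, window: int = 51,
                       min_fraction: float = 0.30) -> bool:
    """Return True if any `window`-residue stretch has more than
    `min_fraction` Ser + Thr (assumed reading of the criterion)."""
    seq = seq.upper()
    if len(seq) < window:
        return False
    threshold = min_fraction * window
    # Count S and T in the first window, then slide one residue at a time.
    count = sum(1 for aa in seq[:window] if aa in "ST")
    if count > threshold:
        return True
    for i in range(window, len(seq)):
        count += seq[i] in "ST"           # residue entering the window
        count -= seq[i - window] in "ST"  # residue leaving the window
        if count > threshold:
            return True
    return False


# Invented sequences: one with a long S/T-rich repeat, one with 25% S.
rich = "MKLA" * 5 + "STSA" * 20 + "GLKV" * 5
poor = "SAAA" * 30
print(has_st_rich_region(rich), has_st_rich_region(poor))
```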
FIG. 4. Relationship between homolog identification and phylogenetic distance. For each tested genome, the fraction of query sequences that had a homolog was plotted against the phylogenetic distance derived from Fig. 1. The query protein sets were all S. cerevisiae cell wall-related proteins or Y. lipolytica GPI-anchored proteins. The top, left-most point corresponds to the searches of S. cerevisiae queries against the S. cerevisiae genome and to searches of the Y. lipolytica queries against the Y. lipolytica genome.
Species tree. The power of any phylogenetic analysis depends on a comparison to a reliable phylogenetic tree for the sampled organisms. Therefore, a tree was generated based on a comparison of amino acid sequences in all orthologous ORFs in all 18 fungal genomes (31). The species tree was produced by neighbor joining from the amino acid identity distance matrix (31).
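As an illustration of what an amino acid identity distance matrix contains, the sketch below computes the fraction of differing positions between pre-aligned sequences. It is an assumption-laden example: the handling of gaps, the function names, and the toy sequences are hypothetical, and the published tree was built from all orthologous ORFs as described in reference 31, not from this code.

```python
from typing import Dict, Tuple


def identity_distance(a: str, b: str) -> float:
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # assumption: positions with a gap are skipped
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def distance_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise distances for every unordered pair of named sequences."""
    names = sorted(aligned)
    return {
        (p, q): identity_distance(aligned[p], aligned[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Invented aligned fragments for three hypothetical species.
toy_alignment = {"sp1": "MKTAYIAK", "sp2": "MKTAYIAR", "sp3": "MKT-YLAK"}
matrix = distance_matrix(toy_alignment)
print(matrix)
```

A matrix of this kind is the input that a neighbor-joining implementation would take to produce the tree topology and branch lengths.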
Degree of conservation. Presence/absence of an ORF on an internal node of the tree was inferred from maximum parsimony, with a postorder tree traversal algorithm (20). The degree of conservation of an ORF (see Fig. 3 and 5, below; see also Table S1 in the supplemental material) was determined as the proportion of the total length of branches on which this ORF is present over the total tree branch length. The branches where an ORF is present are counted only if the ORF is present on both the parent and the child nodes.
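The calculation described above can be sketched as follows. This is a minimal example, assuming a rooted binary tree, Fitch-style parsimony for the presence/absence character, and "absent" as the resolution of an ambiguous root; the tree encoding, function name, and toy branch lengths are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set, Tuple

# Rooted binary tree: internal node -> list of (child, branch_length).
Tree = Dict[str, List[Tuple[str, float]]]


def degree_of_conservation(tree: Tree, root: str, present_in: Set[str]) -> float:
    """Fraction of total branch length on which the ORF is inferred present."""
    states: Dict[str, Set[int]] = {}  # candidate states: 1 present, 0 absent

    def postorder(node: str) -> Set[int]:
        children = tree.get(node, [])
        if not children:  # leaf: observed presence or absence
            states[node] = {1} if node in present_in else {0}
        else:
            child_sets = [postorder(child) for child, _ in children]
            common = set.intersection(*child_sets)
            states[node] = common if common else set.union(*child_sets)
        return states[node]

    postorder(root)

    # Top-down pass: keep the parent's state whenever the child allows it.
    # Assumption: an ambiguous root is resolved as absent.
    assigned: Dict[str, int] = {root: 1 if states[root] == {1} else 0}
    present_length = 0.0
    total_length = 0.0
    stack = [root]
    while stack:
        node = stack.pop()
        for child, length in tree.get(node, []):
            if assigned[node] in states[child]:
                assigned[child] = assigned[node]
            else:
                assigned[child] = next(iter(states[child]))
            total_length += length
            # A branch counts only if the ORF is present at both ends.
            if assigned[node] == 1 and assigned[child] == 1:
                present_length += length
            stack.append(child)
    return present_length / total_length if total_length else 0.0


# Toy tree with invented branch lengths: ((A, B), C).
toy_tree: Tree = {
    "root": [("AB", 0.2), ("C", 0.5)],
    "AB": [("A", 0.1), ("B", 0.1)],
}
score = degree_of_conservation(toy_tree, "root", {"A", "B"})
print(round(score, 3))
```

In the toy tree, an ORF found only in A and B is counted on the two terminal branches below their common ancestor but not on the branch leading to that ancestor, because the root is resolved as absent.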
FIG. 3. Presence of homologs and orthologs for two functional groups of wall-related genes in 17 fungi. The query sequences were S. cerevisiae cell wall-related proteins in the adhesin and glycosylase functional groups. The presence of one or more homologs with an e-value of <10^-10 is shown as a gray square, and the presence of an ortholog is shown as a black square (31). On the right is the total conservation score for each gene, with the group mean conservation score at the bottom of the group. Cellular locations and abbreviations are as follows: CW, cell wall; PM, plasma membrane; ER, endoplasmic reticulum; UN, unknown. The column labeled GPI anchor shows those proteins predicted to have GPI anchors.
FIG. 5. Degrees of conservation for 13 functional groups of cell wall proteins. The width of each bar is proportional to the number of ORFs in that group. The degree of conservation was calculated as described in Materials and Methods.
FIG. 1. Neighbor-joining phylogenetic tree based on amino acid substitution rates for all orthologs in 18 fungal genomes.
48% amino acid differences). The functions of S. cerevisiae cell wall-related proteins and their paralogs. We identified a set of 187 cell wall-related proteins as those annotated in the SGD, GPI-anchored proteins, and their paralogs identified by transitive closure (9). For the 187 cell wall-related proteins, GO annotations report that 108 (63%) are localized either to the cell wall (86 proteins) or to similar cell wall-associated spaces, such as periplasmic (3 proteins) and extracellular (13 proteins). A large set (46 ORFs, 27%) has unknown location or no annotation. Only about 10% of the proteins are reported with intracellular localization, including paralogs of chaperones and glycolytic enzymes. Those two classes of proteins are primarily cytoplasmic, but some proteins in each class are also proposed to be localized in cell walls in S. cerevisiae and in C. albicans on the basis of analyses of isolated walls and/or immunoassays on intact cells (14, 30, 40, 41, 46).
The distribution of molecular function for these 187 proteins (as annotated by the SGD) includes several major classes (Fig. 2). Sixty-one proteins (33%) have no known molecular function; 33 (18%) are glycosyl transferases, hydrolases, or transglycosylases (collectively called "carbohydrate-active enzymes" in the CAZY database [http://www.cazy.org], but abbreviated as "glycosylases" in subsequent discussions) (10); 16 (9%) are involved in unfolded protein binding (called "chaperones," subsequently); 13 (7%) are structural constituents of the cell wall; and 8 (4%) have protease activity. The GO annotation for the biological process most often associated with the 187 S. cerevisiae cell wall-related proteins was "cell wall organization and biogenesis." Forty-eight proteins, including chaperones and glycosylases, were annotated this way.
FIG. 2. Cellular function groups of S. cerevisiae cell wall-related proteins. The size of each sector is proportional to the number of ORFs in that group. CW, cell wall.
Figure 3 illustrates the occurrence of homologs and orthologs in two functional classes of cell wall proteins. Among the adhesins and invasins shown at the top in Fig. 3, those participating in mating were conserved within the closely related Saccharomyces sensu stricto species. For the FLO family of mannose-binding flocculins and invasins, homologs were also found in Saccharomyces sensu lato species and Debaryomyces hansenii, but not in the more distant yeasts or the filamentous ascomycetes. In contrast, the glycosylases at the bottom of the figure show a different conservation pattern. For many members of this large functional class, there were homologs in each of the ascomycete and basidiomycete genomes. Thus, these two functional classes of genes show different patterns of sequence conservation. Table S1 in the supplemental material lists the 187 S. cerevisiae cell wall-related proteins.
The occurrence pattern for most genes followed the expected general pattern: there were homologs in the species most closely related to S. cerevisiae, and the probability of recognizable homology decreased as phylogenetic distance increased. Figure 4 illustrates this trend; it shows decreasing homolog identification with increasing phylogenetic distance. To confirm that this trend was not specific to S. cerevisiae and its cell wall-related proteins, we carried out a similar search and analysis for homologs of 188 potential GPI-anchored wall proteins from Y. lipolytica (a set of similar size). Most of the other species were similarly distant and had homologs to about 35% of the Y. lipolytica proteins. The corresponding points in Fig. 4 cluster near the occurrence-distance curve for S. cerevisiae, except for the Y. lipolytica/E. cuniculi comparison, which identified homologs to only 10 of the Y. lipolytica proteins.
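The quantity plotted in Fig. 4 is simple to compute once the hits are tabulated. The sketch below is illustrative only: the genome labels, distances, and hit sets are invented, and the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def occurrence_vs_distance(hits: Dict[str, Set[str]],
                           distances: Dict[str, float],
                           n_queries: int) -> List[Tuple[float, float]]:
    """Return (phylogenetic distance, fraction of queries with a homolog)
    for each genome, ordered by increasing distance."""
    points = [(distances[g], len(hits[g]) / n_queries) for g in hits]
    return sorted(points)


# Made-up data: four queries searched against three hypothetical genomes.
toy_hits = {
    "near": {"q1", "q2", "q3", "q4"},
    "mid": {"q1", "q2"},
    "far": {"q1"},
}
toy_distances = {"near": 0.1, "mid": 0.4, "far": 0.6}
points = occurrence_vs_distance(toy_hits, toy_distances, n_queries=4)
print(points)
```

Running the same function on a second query set, as was done with the Y. lipolytica proteins, gives a second series of points that can be compared with the first.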
Degree of conservation is related to molecular function. The origins of cell walls can be understood in terms of the roles of the conserved and nonconserved cell wall ORFs. For each function of the 187 ORFs in the S. cerevisiae wall (Fig. 2), we determined the degree of conservation, as detailed in Materials and Methods. The different functional classes differed significantly in their degrees of conservation (Fig. 5). Several functional classes were poorly conserved, with degree of conservation values below 0.4 (on the right in Fig. 5). These protein classes included a small set of Fe transport-related proteins, the uncharacterized ORFs, cell wall structural components, and adhesins. The uncharacterized ORFs include many with known location in the cell wall but no known phenotype in gene deletions (see Table S1 in the supplemental material). Thus, these uncharacterized ORFs may encode proteins without unique or essential molecular activities or those replaceable with other gene products (17, 61). In contrast, components of the GPI synthesis pathway, lipases, proteases, metabolic enzymes, glycosylases, and chaperones were strongly conserved and were present in most tested fungi. In summary, the biosynthetic capability for cell walls is well conserved, but noncatalytic wall components, including the adhesins and the structural and putative structural proteins, are not.
Cell wall-related proteins occur across the eukaryotes. Comparisons of the S. cerevisiae cell wall-related proteins against the GenBank NR database identified homologs in 137 ascomycete species other than the genomes used in the initial fungal comparisons. Homologs were also found in 30 basidiomycete species, 7 zygomycete species, and 2 chytridiomycete species. The GPI biosynthetic enzymes, metabolic enzymes, and chaperones were most likely to have homologs (Fig. 6).
FIG. 6. Occurrence of wall characters and homologs of S. cerevisiae cell wall-related ORFs in other kingdoms and domains. (Top) Conserved characters in walls are shown by gray shading where known to occur. Note that an absence of shading is not definitive, because it may represent a lack of data. ECM, extracellular matrix. (Bottom) Each S. cerevisiae cell wall-related ORF was the query sequence for a BLAST-gtQ search through the NR database. Homologs (gray squares) were characterized by kingdom and domain. Black squares indicate ORFs present in each of the 18 fungal genomes searched and in all fungal phyla except Glomeromycota. There was a single bacterial CHS2 homolog, and its uniqueness and e-value (<10^-18) imply that it is derived from horizontal gene transfer. A horizontal line sets off the two classes of proteins that are primarily intracellular.
Cavalier-Smith has argued that fungal cell walls are derived from ancestral chitinous walls of desiccation-resistant cysts in many protist clades (7, 8). Certainly, chitinous cysts and spores are present in other eukaryote kingdoms, including Amoebozoa (such as Entamoeba [11]), Chromalveolata (Phytophthora [43, 44]), and Excavata (Giardia [57]). In addition, S. cerevisiae walls share extracellular disulfide cross-links and GPI-anchored cell adhesion molecules with Amoebozoa, perhaps the earliest-diverged eukaryotes (Fig. 6) (7, 26, 55, 59, 63). Cavalier-Smith's hypothesis motivated our use of genomics to identify genes that are key to wall biogenesis and structure and to test which are conserved and which are subject to adaptive selection.
The deep roots for the ascomycete-basidiomycete split show large amino acid distances among all homologs in the fungi (25) (Fig. 1). This observation implies that cell wall genes could also have diverged broadly. Therefore, the identification of homologs in some cell wall proteins among distantly related organisms implies greater-than-average sequence conservation. In other words, anciently diverged fungal clades should retain sequence similarity only in the most conserved of the ancestral cell wall genes. The sequence comparisons summarized in this study demonstrate precisely that.
Prevalence of cell wall-related sequences in fungi. The evolutionary distance between two fungal species is correlated with the divergence of their cell wall proteins (Fig. 4). The Saccharomyces sensu stricto species have homologs for at least 166 of the 187 S. cerevisiae cell wall-related ORFs, with an amino acid identity of at least 86%. The number of conserved wall genes decreases in more distant fungal clades. Saccharomyces sensu lato yeasts have homologs for at least 70% of the S. cerevisiae cell wall-related proteins; filamentous ascomycetes have fewer. Among the homologs not found in filamentous ascomycetes were genes encoding extracellular proteins, including cell wall structural proteins, adhesins, and extracellular members of the GO classification "cell wall organization and biogenesis." Also not conserved between yeast and filamentous ascomycetes were many of the proteins involved in the cell wall stress response pathway.
The two basidiomycete genomes in our study had homologs for 68 of the S. cerevisiae putative cell wall proteins (36%). The intracellular components were mostly conserved, except for the cell wall stress response pathway components. Among the cell wall-localized proteins, all but six of the glycosylases were conserved, as were lipases and proteases. Twelve GPI-anchored proteins from S. cerevisiae had basidiomycete homologs. The large number of homologs to the Y. lipolytica GPI proteins supports this concept of conservation of GPI-anchored proteins, as previously noted in genomic analyses of several fungi (12, 13, 29). Thus, basidiomycetes share with S. cerevisiae intracellular and extracellular activities that are critical for biosynthesis and assembly of fungal cell walls.
In the search of the nonredundant NCBI database, almost every major fungal phylum contained homologs of some S. cerevisiae cell wall proteins (Fig. 6). (The single exception, Glomeromycota, is probably due to the small number of sequences in GenBank.) Thus, homologs of some of the S. cerevisiae cell wall-related proteins are likely to be present throughout the fungi.
Differences in divergence of functional classes of cell wall-related proteins. A major finding of this analysis is that the degree of conservation among cell wall proteins is strongly correlated with the cellular roles of those proteins (Fig. 5). Many of the proteins that reside in S. cerevisiae cell walls have diverged to the extent that homologs were not recognized in organisms outside of the Saccharomyces sensu lato group. The least-conserved include the three Fit iron transport proteins and the largest class, homologs of unannotated S. cerevisiae cell wall ORFs. Homologs of unannotated sequences were often found only in the sensu stricto group (Fig. 5; see also Table S1 in the supplemental material). These poorly conserved proteins probably have important (45) but noncatalytic roles in walls.
Among the adhesins, most have homologs in the Saccharomyces sensu lato group but not in more-distant ascomycetes (Fig. 3; see also Table S1 in the supplemental material). There is anecdotal evidence that adhesins are poorly conserved, because they are subject to sexual isolation and to diversifying selection for adaptation to different environments (16, 33, 56). The structural cell wall proteins (in the CWP, TIR, PAU, and DAN families) are also poorly conserved. This observation may reflect the idea that this class is rich in proteins with very-low-complexity compositions. Within each sequence, only short segments are under selective pressure to retain sequence: the secretion and GPI signals and the relatively short glycosylation and transglycosylation sites (9, 16, 28). The majority of each sequence has low complexity and would have evolved faster than other parts of the genome (49).
Two protein classes show intermediate levels of conservation: the cell wall stress response pathways and cell wall biogenesis pathways. Their intermediate mean conservation scores result from their composition, a mixture of poorly conserved and highly conserved proteins. In general, those proteins resident in the walls are poorly conserved. Within these two classes, the plasma membrane and intracellular proteins are a mixture of conserved and nonconserved sequences (see Table S1 in the supplemental material).
Strongly conserved proteins. In contrast, the strongly conserved functional classes are key biosynthetic and processing enzymes that must be useful in wall biosynthesis and assembly. These include the glycosylases, GPI synthetic enzymes, proteases, and lipases. These proteins include many orthologs that may function in homologous roles in wall biogenesis across the fungi. Among the most highly conserved are three sets of partially redundant glycosylases: the Chs chitin synthases, the Gas transglycosidases, and the Fks β-1,3-glucan synthases. Other highly conserved components were peptidases, phosphatases, lipases, and enzymes of the N-acetylglucosamine synthesis pathway (see Table S1 in the supplemental material).
It is not surprising that chaperone sequences and triose phosphate dehydrogenases (in the metabolic enzyme class) are strongly conserved, because these are genes whose sequences are highly conserved across the biome (3, 21, 60). Their primary locations are intracellular. Although their role as cell wall proteins is less well known and somewhat controversial, it is well documented in two ascomycetous yeasts, S. cerevisiae and C. albicans (14, 30, 40, 41, 46). Given their controversial or auxiliary role in cell walls and their dual roles as wall resident and key metabolic activities, these classes could be excluded from our analysis ("chaperones" and "metabolism" in Fig. 4 and 6). That exclusion would in fact strengthen our key observation, that wall biosynthetic genes generally have conserved sequences and wall resident proteins do not.
Modes of sequence divergence in cell wall components. In contrast to the homology of cell wall biosynthetic pathways, the actual composition of cell walls is lineage specific. This dichotomy is similar to the distinction in genomics between conserved "housekeeping" genes and a faster-evolving set of "accessory" genes (23). Two evolutionary mechanisms could give rise to such diversity and lineage specificity in the "accessory-like" genes: either rapid sequence divergence or frequent loss and substitution of the proteins. In the first evolutionary scenario, a homologous core set of cell wall proteins exists, but they are conserved only in protein structure and function, not in sequence. For instance, the greatest mean amino acid distance between genomes of Saccharomyces sensu stricto species is 16% (Fig. 1). The S. cerevisiae wall proteins generally have homologs within the sensu stricto group, and so the sequences conserved in this clade must have been conserved within the timescale corresponding to this difference. Rapid evolution would make homology unrecognizable in more distant groups. This result is consistent with the observation that many of these sequences are rich in low-complexity sequences, which are evolutionarily less constrained (49). In such a case, proteins resident in fungal cell walls may have a single origin but have diverged among different fungal lineages, through either neutral or adaptive divergence.
In the second evolutionary scenario, fungal cell wall architecture is conserved as a whole, with frequent additions and deletions of specific macromolecular components during lineage diversification. In this case, fungal cell walls are more analogous to a cell organelle that has evolved independently among lineages with few underlying homologous components. Functional features of fungal cell walls have been maintained by natural selection, similar to the multiple emergences of fins among different vertebrate groups. These fins show functional homology and anatomic convergence despite their multiple origins. Possible fungal examples might be the relocalization of novel proteins to the wall following acquisition of secretion signals and GPI anchor sequences or the internal repeats that include Gln residues to be esterified to wall glucans (17). Either of these additions could happen by recombination or by insertion of foreign DNA (53, 56).
Whether the cause is a rapid sequence divergence or frequent loss and substitution, it appears beyond dispute that fungal cell wall components evolve faster than do core metabolic protein sequences (Fig. 5). Fast evolution in a large number of cell wall-related ORFs may be driven by adaptive divergence. Unlike other cell organelles, cell walls in fungi are in contact with the external environment, play a direct role in cell adaptation, and must have evolved as highly specialized and dynamic structures for colonization, host immune system evasion, signal transduction, transport, and structural maintenance. Fungi have continuously put these structures to the test through natural selection, and the selection continues. Cell wall proteins that are harmful or neutral to the cell can be mutated or removed from the genome. As a result, the organism loses a maladaptive factor and gains efficiency in its growth and replication rates. Given that natural selection favors improved fitness and that the organism can gain two benefits from a single such event, the likelihood of selection for such mutations and deletions increases.
Conservation of cell wall biogenesis. Several classes of fungal cell wall-related proteins are common to eukaryotes: glycosylases, proteases, proteins needed for GPI synthesis, and the poorly characterized Pry proteins, as well as chaperones. The presence of these classes throughout the fungi and in several eukaryote kingdoms (Fig. 6) implies that they are ancestral. These genes represent a conserved set of metabolic activities that were coordinated in the formation of cell walls. Many of the conserved glycosylases have plant homologs with annotations that show roles in plant cell wall biogenesis. These proteins include glucosidases Dse4, Spr1, and Exg1, mannosidases involved in glycoprotein biogenesis (Mns1 and Mnl1), and the Fks β-1,3-glucan synthase subunits. The roles in plant wall biogenesis include synthesis and processing of callose (β-1,3-glucan), cellulose, and glycoproteins.
The enzymes of chitin metabolism are extremely highly conserved. Two enzymes in the biosynthetic pathway for N-acetylglucosamine (Gfa1 and YMR84w) are conserved in ascomycetes and basidiomycetes, as are chitin deacetylases (Cda1 and Cda2) (see Table S1 in the supplemental material). Fungal chitin synthases (Chs) are homologous to those in other fungi, metazoa, amoebozoa, and chromalveolata (Fig. 6). This high degree of conservation is consistent with an ancestral and continuing role of chitin in cell walls of many eukaryotes (4, 7, 43, 44, 55, 57).
Given the conservation of wall biosynthetic sequences, the broad distribution of specific wall glycoconjugates, and the similarity in roles of plant and fungal glycosylases, it is reasonable to infer that a common ancestor to fungi and other eukaryotes was an organism with an extensive carbohydrate chemistry repertoire. The commonalities identified here suggest that each eukaryotic lineage has conserved the mechanism to generate diverse extracellular structures but has altered the products of the mechanism in their ancestor to form cell walls (in plants) or some other extracellular matrices (in metazoans). This conserved mechanism uses GPI synthesis and processing enzymes, carbohydrate-processing enzymes, and chaperones to enable the localization and proper folding of the proteins (3, 21, 40, 41).
In summary, genes that encode the proteins that make up fungal cell walls have evolved so fast that their homology is often not recognized, even within the Saccharomyces sensu lato group. This rapid divergence appears to be driven by the great diversity of strong selective pressures on cell interactions, including mating, colony and biofilm formation, pathogenesis, and immune escape. On the other hand, there is a conserved core of sequences that are involved in wall biogenesis throughout the fungi and in other eukaryote kingdoms. The conserved metabolic capabilities in Fig. 6 are apparently ancestral. Their conservation implies that the ability to organize a wall may predate the divergence of the plants from the opisthokonts, including the fungi.
This work was supported by NIH/NIGMS SCORE grants S06 GM 060654 and S06 GM 076168 and NIH/RCMI grant RR 030307. J.E.C. was supported in part by a fellowship from NSF MAGNET-STEM.
Published ahead of print on 19 October 2007.
Supplemental material for this article may be found at http://ec.asm.org/.
α5β1 and αvβ3 integrins react with Candida albicans alcohol dehydrogenase. Microbiology 147:3159-3164.