Eukaryotic Cell, June 2007, p. 1041-1052, Vol. 6, No. 6
1535-9778/07/$08.00+0 doi:10.1128/EC.00041-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Aberdeen Fungal Group, School of Medical Sciences, Institute of Medical Sciences, University of Aberdeen, Aberdeen AB25 2ZD, United Kingdom,1 Unité Biologie et Pathogénicité Fongiques, INRA USC 2019, Institut Pasteur, Paris, France,2 Laboratoire de Parasitologie-Mycologie, Service de Microbiologie, Hôpital Necker-Enfants Malades, Université Paris 5-René Descartes, Faculté de Médecine, Paris, France,3 Laboratory for Mycology, Division of Research and Diagnostics, Center for Disease Control, Taipei, Taiwan,4 Dipartimento di Biologia, Via San Zeno 35-39, Università di Pisa, 56127 Pisa, Italy,5 The Peter Medawar Building for Pathogen Research and Department of Zoology, University of Oxford, OX1 3SY Oxford, United Kingdom6
Received 6 February 2007/ Accepted 23 March 2007
| ABSTRACT |
|---|
α/α types and was lowest for heterozygous a/α types. The tree of clades defined by MLST was not congruent with trees generated from the individual gene fragments sequenced, implying a separate evolutionary history for each fragment. Analysis of nucleic acid variation among loci and within loci supported recombination. Computational haplotype analysis showed a high frequency of recombination events, suggesting that isolates had mixed evolutionary histories resembling those of a sexually reproducing species.
| INTRODUCTION |
|---|
C. albicans possesses homologs of the mating type genes in Saccharomyces cerevisiae (25) and can be induced to undergo a mating process in which nuclei from two diploid cells fuse to form a tetraploid (29, 36). However, the fused nuclei randomly lose chromosomes to return to the diploid state rather than undergoing a true process of meiosis (3, 4), probably because C. albicans lacks some of the genes required for meiosis in other yeasts (70). The mating process occurs between diploid strains homozygous for the mating types a/a and α/α, which arise spontaneously in a minority of clinical isolates (35, 38, 47, 63). However, mating usually occurs only after strains of opposite mating types have been induced to undergo a transition from the commonly observed, spheroidal, white cell form to the elongated, papule-surfaced, opaque cell form. This form survives at room temperature rather than body temperature and appears to be primed to undergo a mating process (34, 44, 69). Although the opaque cell form of C. albicans is associated in the laboratory with temperatures lower than 37°C, coinjection of genetically marked white cells of opposite mating types intravenously into mice led to mating in vivo (26), which suggests that the white-opaque switch might also occur in injected tissues. Recent evidence suggests that pheromone signals from rarely occurring opaque cells may induce biofilm formation among white cells of opposite mating types, which then enhances the close contact needed to induce mating in vivo (16).
Haplotype analyses of C. albicans strains show that while the dominant mode of reproduction in the species is clonal, recombination events also occur (65). Moreover, the spontaneous occasional occurrence of homozygous a/a or α/α isolates in the course of longitudinal sequences of a/α isolates from single individuals (47) strongly suggests that C. albicans does indeed undergo steps known to be associated with mating and white-opaque switching in colonized and infected patients. Full and partial chromosomal loss and replacement are well demonstrated phenomena for C. albicans (12, 31, 58, 72), and microvariation (also sometimes called microevolution) has been demonstrated by various strain typing approaches (8, 37, 47, 56). We therefore previously suggested that the population of cells colonizing or infecting a particular site in a patient comprises a mixture of cells with nearly identical genomes (47).
Multilocus sequence typing (MLST) has gained widespread acceptance as a tool for exploring strain-level differences in microbial species; it has been used for typing of many pathogenic bacteria (40), and systems have now been devised for several pathogenic fungi, including C. albicans (6, 7, 9), Candida glabrata (18), Candida krusei (28), and Candida tropicalis (64). MLST combines a high discriminatory index with exceptional portability since MLST data can be accessed and updated on the Internet (linked via http://www.mlst.net/ or http://pubmlst.org/). Sequence data from multiple loci can be used not only to analyze clonality and recombination for strain populations within a species but also to provide evidence for species-level differentiation (67). A population genetic analysis of C. albicans MLST data based on 416 isolates from separate sources (63) showed that the species could be divided into several clades and that clades differed in the proportions of isolates they included from different geographical and anatomical sources. Since our C. albicans MLST database has now grown to include well above 1,000 isolates, with a more equitable distribution of isolates from different geographical sources, we undertook a fresh analysis of the data to provide a more robust reference basis for definition of MLST clades, to characterize further the prevalence of genetic recombination events, and to explore further the mechanisms underlying microdiversity in C. albicans.
We show here that 97% of the isolates could be assigned to one of 17 clades; that the proportions of A, B, and C genotypes, defined by the presence or absence of an intron in the ribosomal DNA region, differed significantly among clades; that clades were enriched with isolates from particular geographical areas; that the five most populous clades differed significantly in their proportions of commensal and pathogenic isolates; and that phylogenies of the seven fragments sequenced were not congruent, implying independent evolution of the fragments. A computational haplotype analysis of the sequences indicated that recombination events have led to isolates with mixed evolutionary histories, resembling the picture seen with a sexually reproducing species.
| MATERIALS AND METHODS |
|---|
The MAT type (a/α, a/a, or α/α) was determined by PCR (66). Susceptibility testing, performed on 658 of the isolates, was done by EUCAST methodology (14, 15). Isolates were defined as having reduced susceptibility to an agent, on the basis of Clinical Laboratory Standards Institute breakpoints (45), when the MICs of fluconazole, itraconazole, voriconazole, and flucytosine were >8 µg/ml, >0.13 µg/ml, >1 µg/ml, and >4 µg/ml, respectively.
Analysis of MLST data. The MLST results for single nucleotide polymorphisms (SNPs) in the seven sequenced loci were concatenated into a single sequence and converted with proprietary software to represent each SNP as two bases, as previously described (63, 64). This procedure generated sequences that could be handled by MEGA software, which cannot analyze heterozygous code data for nucleotide P distances. For any pair of diploid isolates, the result at each SNP can be homozygous and identical between the isolates, heterozygous and identical, homozygous and different, or heterozygous in one isolate and homozygous in another. For example, the sequencing result (in IUPAC single-letter code) for a given SNP across a set of strains might appear as A, T, or W (A or T). Concatenated data for each SNP in the seven C. albicans genes were therefore rewritten twice for homozygous (A, C, G, or T) data or once each for the two component bases for heterozygous (K, M, R, S, W, or Y) data. This procedure was the functional equivalent of scoring a pair of results with a 1 for homozygous or heterozygous identical data, 0 for homozygous different data, and 0.5 when one polymorphic site had a heterozygous result and the other had a homozygous result and then creating a difference matrix. The concatenated sequences were used to generate a single-linkage dendrogram for the 1,391 isolates by the unweighted-pair group method using average linkages (UPGMA), determined by P distances, as implemented by MEGA3 software (32).
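The two-base rewriting of SNP codes described above can be sketched as follows. This is only an illustration and not the proprietary software used in the study: the names `expand_snps` and `p_distance` are hypothetical, and writing the two component bases of a heterozygous code in alphabetical order is an assumption.

```python
# Two-base expansion of IUPAC SNP codes (illustrative sketch; names are hypothetical).
# Assumption: the two bases of a heterozygous code are written in alphabetical order.
IUPAC_TO_PAIR = {
    "A": "AA", "C": "CC", "G": "GG", "T": "TT",  # homozygous: base written twice
    "K": "GT", "M": "AC", "R": "AG",             # heterozygous: one of each base
    "S": "CG", "W": "AT", "Y": "CT",
}


def expand_snps(seq):
    """Rewrite each SNP code in a concatenated sequence as two bases."""
    return "".join(IUPAC_TO_PAIR[code] for code in seq.upper())


def p_distance(seq1, seq2):
    """Proportion of differing positions between two expanded sequences."""
    e1, e2 = expand_snps(seq1), expand_snps(seq2)
    if len(e1) != len(e2):
        raise ValueError("sequences must cover the same SNPs")
    return sum(a != b for a, b in zip(e1, e2)) / len(e1)


if __name__ == "__main__":
    print(expand_snps("AWT"))    # AAATTT
    print(p_distance("A", "A"))  # 0.0
    print(p_distance("A", "W"))  # 0.5
    print(p_distance("A", "T"))  # 1.0
```

Expressed as distances, the values 0, 0.5, and 1 correspond to the similarity scores 1, 0.5, and 0 given in the text. How two different heterozygous codes compare depends on the base order chosen, which is why that order is flagged as an assumption.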
Clades were numbered to be consistent with an earlier publication on population structures (63). Clonal clusters of isolates that differed in sequence at only one of the seven loci were determined by eBURST, version 3.0 (21, 62). Analysis of variance (ANOVA) was done with the general linear model module of SPSS, version 14.0.
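The idea of clonal clusters linked by single-locus differences can be illustrated with a small sketch. This is not eBURST itself, which also predicts founding types; it only groups DSTs that are connected through chains of single-locus variants. The function name and the toy allele profiles are hypothetical.

```python
from itertools import combinations


def single_locus_variant_clusters(profiles):
    """Group DSTs connected through differences at exactly one locus.

    profiles: dict mapping a DST label to a tuple of allele numbers, one per locus.
    Returns a list of sets of DST labels; sets of size 1 are singletons.
    """
    parent = {dst: dst for dst in profiles}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        n_diff = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if n_diff == 1:  # single-locus variants are placed in the same cluster
            parent[find(a)] = find(b)

    clusters = {}
    for dst in profiles:
        clusters.setdefault(find(dst), set()).add(dst)
    return list(clusters.values())


if __name__ == "__main__":
    # Toy seven-locus profiles, invented for illustration only.
    toy = {
        "DST_a": (1, 1, 1, 1, 1, 1, 1),
        "DST_b": (1, 1, 1, 1, 1, 1, 2),
        "DST_c": (1, 1, 1, 1, 1, 3, 2),
        "DST_d": (5, 5, 5, 5, 5, 5, 5),
    }
    for cluster in single_locus_variant_clusters(toy):
        print(sorted(cluster))
```

In the toy data, DST_a and DST_c differ at two loci but still fall in one cluster because each is a single-locus variant of DST_b; DST_d remains a singleton.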
The congruence of maximum parsimony trees generated for each of the seven gene fragments sequenced for MLST was determined by an approach previously described for bacterial MLST data (20, 24). An MLST data set was chosen for 33 isolates that represented the diversity of the whole panel of isolates. The isolates were examples of the putative founding DSTs for 30/33 eBURST clusters which contained three or more DSTs and for which a founding DST was designated by the software. For three further eBURST clusters with multiple candidates as founding DSTs, a DST was chosen at random. Congruence tests were done by the method of Holmes et al. (24), with the software package PAUP*. For each gene, the optimum maximum likelihood (ML) tree was found. The data for each gene were then fitted in turn to the ML trees for each of the other six genes, and an ML score was calculated. Finally, the data for each gene were fitted to 200 trees of random topology, and ML scores were calculated.
The seven genes used for MLST analysis each showed multiple SNPs. FastPhase analysis of the SNP data generated a list of all haplotypes within the 1,391-isolate sample. If an isolate was heterozygous for zero or one SNP in a particular gene, then the SNP haplotypes for that gene could be determined unambiguously. If there were two or more heterozygous SNPs, the software employed a probabilistic approach to determine the likeliest haplotypes. If there was no recombination between haplotypes, nor any recurrent or reverse mutation, then n SNPs were expected to result in n + 1 haplotypes.
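The unambiguous case described above (zero or one heterozygous SNP in a gene) can be sketched in a few lines. This is a minimal illustration with hypothetical function names; genotypes with two or more heterozygous SNPs are deliberately left unresolved here, since the study handled them probabilistically with FastPhase.

```python
# IUPAC heterozygous codes and their two component bases.
HET_CODES = {"K": "GT", "M": "AC", "R": "AG", "S": "CG", "W": "AT", "Y": "CT"}


def resolve_haplotypes(genotype):
    """Return the two haplotypes of a diploid SNP genotype, or None if ambiguous.

    genotype: string of IUPAC codes, one per SNP in a single gene.
    """
    het_sites = [i for i, code in enumerate(genotype) if code in HET_CODES]
    if len(het_sites) > 1:
        return None  # phase is ambiguous; a probabilistic method is needed
    if not het_sites:
        return genotype, genotype  # fully homozygous: two identical haplotypes
    i = het_sites[0]
    first, second = HET_CODES[genotype[i]]
    return (genotype[:i] + first + genotype[i + 1:],
            genotype[:i] + second + genotype[i + 1:])


def max_haplotypes_without_recombination(n_snps):
    """Expected haplotype count with no recombination or recurrent mutation."""
    return n_snps + 1


if __name__ == "__main__":
    print(resolve_haplotypes("ACG"))  # ('ACG', 'ACG')
    print(resolve_haplotypes("AWG"))  # ('AAG', 'ATG')
    print(resolve_haplotypes("RWG"))  # None
```

Observing more than `n + 1` distinct haplotypes for `n` SNPs in a gene is then a signal that recombination or recurrent mutation has occurred.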
Population genetic tests were carried out by using the Arlequin package, version 3.01 (19). Haplotypes were inferred from MLST genotype data by FastPhase (54). Perl scripts were used for data simulation and for detection of putative recombinant haplotypes (see Table S2 in the supplemental material).
| RESULTS |
|---|
Designation of clades. eBURST analysis of the MLST data assigned the 1,391 C. albicans isolates to 53 clonal clusters of DSTs differing from each other by a single genotype and to 368 singleton DSTs. Details of the clonal clusters that included more than three DSTs are summarized in Table 2. Clusters 24 to 33 each comprised three DSTs, and clusters 34 to 53 each comprised two DSTs. The largest clonal cluster, cluster 1, based on DST 69 as the putative founding type, is illustrated in Fig. 1. Its complex structure was typical for the 10 largest clonal clusters and suggests a considerable level of recombination events among the DSTs. A highly clonal population would be expected to generate noncomplex clusters consisting mainly of a single DST.
Characteristics of clades. To determine the statistical significance of the differentially distributed properties of isolates in the 17 clades shown in Table 3, a univariate ANOVA was run, with the clade number as the dependent variable and with ABC type, MAT type, geographical origin, anatomical origin, decade of isolation, and reduced susceptibility to the four antifungal agents listed in Table 3 as fixed factors. The data were analyzed with the singletons excluded and with the anatomical source omitted for isolates of animal origin. The ABC type (P < 10⁻⁶) and geographical origin (P = 0.005) were highly significant factors relating to clade numbers, while the MAT type (P = 0.88), decade of isolation (P = 0.61), anatomical origin (P = 0.72), and reduced susceptibility to fluconazole (P = 0.28), itraconazole (P = 0.83), voriconazole (P = 0.27), and flucytosine (P = 0.45) did not show significant associations with clade designations.
The clades obviously differed considerably in the proportions of ABC types represented (Table 3). Clades 1 and 2 comprised >93% type A isolates: among 56 isolates of the most common clade 1 DST, DST 69, for which ABC type data were available, only a single example of type B was found. Clades 7 and 13 consisted entirely of type A isolates, and clade 9 was highly enriched with type B isolates. Clades 3, 6, 10, 14, and 16 comprised >60% isolates of type B, while type C strains showed their highest prevalence in clades 4, 5, 12, and 15.
Among 1,294 isolates typed for MAT status, 110 (8.5%) were homozygous, with the remaining 1,184 isolates being heterozygous. MAT type α/α (78 isolates) was more than twice as common as type a/a (32 isolates). Reduced susceptibilities to fluconazole, itraconazole, voriconazole, and flucytosine were all significantly associated with MAT homozygosity (χ² test; P < 0.0001 for each agent). Among 592 MAT heterozygous (a/α) isolates, 27 (4.6%), 21 (3.5%), 9 (1.5%), and 23 (3.9%) showed reduced susceptibility to fluconazole, itraconazole, voriconazole, and flucytosine, respectively. The corresponding data for 24 susceptibility-tested a/a isolates were 10 (41.7%), 5 (20.8%), 4 (16.7%), and 6 (25.0%), and those for 42 α/α isolates were 11 (26.2%), 9 (21.4%), 4 (9.5%), and 4 (9.5%). Thus, the highest prevalence of reduced susceptibility to azoles and flucytosine was found among a/a isolates, and the lowest prevalence was found among a/α isolates. None of the a/α isolates showed reduced susceptibility to all four of the agents for which resistance breakpoints were available, while four (16.7%) of the a/a isolates and two (4.8%) of the α/α isolates showed reduced susceptibility to all four agents.
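As a worked check of the association between MAT homozygosity and reduced fluconazole susceptibility, the counts reported above can be placed in a 2 x 2 table: 27 of 592 heterozygous isolates versus 21 of 66 homozygous isolates (10 of 24 plus 11 of 42). The sketch below computes a Pearson chi-square statistic without continuity correction. The text does not state how the original test was laid out, so the 2 x 2 layout and the function name are assumptions.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    cells = [(a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2)]
    chi2 = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / n
        chi2 += (observed - expected) ** 2 / expected
    # For one degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: MAT heterozygous, MAT homozygous.
    # Columns: reduced fluconazole susceptibility, susceptible.
    chi2, p = chi_square_2x2(27, 592 - 27, 21, 66 - 21)
    print(f"chi-square = {chi2:.1f}, P = {p:.2g}")
```

Under these assumptions the P value falls well below 0.0001, in line with the result reported in the text.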
Isolates in the most populous clades, clades 1 to 4, could be found from most geographical sources; nevertheless, the differential enrichment of these and other clades with isolates from different areas was statistically significant. North American isolates were predominantly assigned to clades 1 and 3 (Table 3). Among the isolates in clades 2 and 6, 70% came from the United Kingdom (70% of all UK isolates were assigned to clades 1, 2, and 4). In contrast, 59% of isolates in clade 10 originated from France and other European countries outside the United Kingdom, although these made up only 11% of all the isolates from these regions. Isolates from Southeast Asian countries and Japan dominated clades 14 to 17 and the singletons, while South American isolates made up 29% of those in clade 8. Among isolates from Africa, 40% were assigned to clade 4. Although the clade distributions of isolates from England and Scotland were generally similar, isolates from Scotland were less than half as common as those from England in clades 8 and 10 and more than twice as common in clade 9.
Particular mention should be made of the isolates that constituted clade 13, which was the cluster least similar to the remainder of the isolates. Most of the clade 13 isolates were examples of a phenotypically distinct variant of C. albicans that was originally proposed as the separate species Candida africana (68). Of the 14 isolates in clade 13, plus a 15th isolate, P2246, which was more closely related to the clade 13 group than to other types but fell outside the 0.04 P distance clade cutoff, 12 were vaginal isolates and 2 came from the penis; only one clade 13 isolate was from a blood culture. Nine of the 15 isolates were of African origin (Madagascar and Angola), but 1 was from Japan, 2 were from the United Kingdom, 2 were from Germany, and 1 (the blood isolate) was from Chile. Only 2 of the 15 isolates, the one from Japan and P2246, were not of DST 182. These data confirm the group as a highly distinct C. albicans clade with global distribution.
Although the differential distribution among C. albicans clades of isolates from blood, the oropharynx, the vagina, and other sources did not reach statistical significance in the univariate ANOVA, as it had in our previous analysis of 416 isolates (63), the prevalence of oropharyngeal isolates among all isolates in clades 5 and 6 (42% and 56%, respectively) was notably high, and as already described, clade 13 comprised principally vaginal isolates.
Commensal and disease-producing isolates. Although the general linear ANOVA model showed no significant associations between anatomical sources of the 1,391 isolates and their clades in the present study, our previous analysis of MLST data for a smaller isolate set did suggest such an association (63). Schmid and colleagues previously hypothesized that one group of isolates comprising the most commonly encountered strain type by Ca3 fingerprinting represents a "general purpose" strain type with a higher propensity to cause infections than other types (55). Since 19/21 isolates we were given from this strain type corresponded to our MLST clade 1, we were interested in examining whether the proportion of isolates from clade 1 associated with disease rather than commensalism differed from that proportion in other clades.
A subset of 559 isolates was chosen from the full database of 1,391 isolates on the basis of two criteria. The first was that the isolates originated from patients in Western Europe. The second was that the raw information provided with the isolates allowed them to be assigned to one of the following three classes: commensal isolates, blood isolates, and isolates obtained from superficial Candida infections. Results from a single region were analyzed to minimize possible geographical bias in the data, and only the Western European isolates provided adequate numbers for the analysis. Results were analyzed for 327 European blood isolates, 177 commensal isolates (132 from the oropharynx, 35 from feces, 7 from healthy vaginas, and 3 from other superficial sites), and 55 isolates from superficial infections (37 oral isolates and 18 vaginal isolates) where the commensal or pathogenic status was known unequivocally. The clade distributions of these isolates are shown in Table 4. Isolates from clades 1, 2, 3, 4, and 11 were large enough sets to be treated as separate individual groups, with the remainder of isolates (including singletons) treated as a sixth single group. By χ² analysis, the clade distributions shown in Table 4 differed significantly (P < 0.001). Comparison of only the numbers for commensal and blood isolates also showed a highly significant difference in their clade distributions (χ² test; P = 0.003); the equivalent analysis of clade distributions of commensal and superficially pathogenic isolates gave a P value of 0.013. A χ² test done exclusively with the data for the largest clades (1, 2, 3, 4, and 11) showed a P value of 0.003. As shown in Table 4, major contributors to differences in distributions were related to clade 1, which contained a lower proportion of the blood isolates and a higher proportion of the isolates from superficial infections than its proportion of commensal isolates, and to clade 4, with the opposite relative distribution. However, the proportion of isolates in clade 1 that came from blood cultures was lower than the proportion in all other clades except clade 11 (Table 4). Clade 4 was particularly enriched in blood isolates.
Relative to the distribution of all isolates among clades, the distribution of isolates with reduced azole susceptibility showed a higher than expected prevalence in clades 5 and 6 and in the singletons, but the difference was not statistically significant (Table 3). Among the 33 isolates with low flucytosine susceptibility, 24 (72.7%) belonged to clade 1; this most common clade accounted for 33% of isolates overall, so the prevalence of flucytosine-resistant isolates in this clade was twice as high as expected. This result was highly significant (χ² test applied to data for large clades 1 to 4 and 11 as separate groups and to other isolates as a single group; P < 0.0001).
Phylogenetic congruence tests. For the seven genes used for MLST analysis, a test of congruence of their phylogenetic histories was carried out. Based on a set of 33 isolates chosen as described in Materials and Methods, PAUP* was used to establish the optimum ML tree for each gene. The scores for these trees (ln likelihood [L]) are shown in the second column of Table 5. The data for each gene in turn were then fitted to the optimum trees for each of the other six genes, and the range of scores for these is shown in the third column of Table 5. Finally, to assess the significance of these scores, the data for each gene were fitted to 200 trees of random topology (Table 5, final column). The results show that in each case, the fit of a gene's data to the trees for the other six genes was no better than the fit to random trees. This implies that the phylogenetic histories of the seven genes are not congruent.
To test whether recombination between different haplotypes could be a significant source of new variation, computer scripts were written to examine systematically the haplotypes for each gene from the whole data set for 1,391 isolates. The analysis identified sets of four haplotypes in which two of the haplotypes could have arisen by a single recombination of the other two. For all seven gene fragments, numerous sets of this type were found (AAT1a, 104; ACC1, 43; ADP1, 54; MPIb, 23; SYA1, 167; VPS13, 361; and ZWF1b, 364). To test the significance of these findings, sets of randomly generated haplotypes of the same length and diversity as the real ones were tested. Recombinant sets occurred by chance less than once per simulated population, indicating that the numbers observed in the real data are highly significant (P < 10⁻⁶).
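A search for such four-haplotype sets might look like the sketch below. It is not the authors' Perl code. It assumes that a set qualifies when both reciprocal products of a single crossover between two haplotypes are present in the pool and all four haplotypes are distinct; the function name and toy haplotypes are hypothetical.

```python
from itertools import combinations


def recombinant_sets(haplotypes):
    """Find sets of four distinct haplotypes related by a single crossover.

    haplotypes: iterable of equal-length strings, one character per SNP.
    Returns a set of frozensets, each holding two parental haplotypes and
    the two reciprocal products of one crossover between them.
    """
    pool = set(haplotypes)
    found = set()
    for h1, h2 in combinations(sorted(pool), 2):
        for k in range(1, len(h1)):  # crossover between SNP k-1 and SNP k
            r1 = h1[:k] + h2[k:]
            r2 = h2[:k] + h1[k:]
            quartet = {h1, h2, r1, r2}
            if len(quartet) == 4 and r1 in pool and r2 in pool:
                # A frozenset removes duplicates, since the parent and
                # product roles within a quartet are interchangeable.
                found.add(frozenset(quartet))
    return found


if __name__ == "__main__":
    toy = ["AAAA", "TTTT", "AATT", "TTAA", "ATAT"]  # invented haplotypes
    for quartet in recombinant_sets(toy):
        print(sorted(quartet))
```

Significance could then be assessed as the text describes, by running the same search on randomly generated haplotype sets of matching length and diversity and comparing the counts.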
For each of the seven genes in turn, we analyzed the genotypes for linkage disequilibrium (LD) between each pairwise combination of SNPs within the gene (Arlequin software). All 1,391 isolates were included in this analysis. The numbers of pairwise SNP combinations and the percentages of these showing significant LD (P < 0.05) were as follows: for AAT1, 231 pairs and 31%; for ACC1, 153 pairs and 35%; for ADP1, 351 pairs and 47%; for MPIb, 435 pairs and 41%; for SYA1, 276 pairs and 50%; for VPS13, 351 pairs and 50%; and for ZWF1b, 276 pairs and 39%. The lack of complete LD within genes is consistent with the existence of recombination between haplotypes.
The analysis was repeated with only the 467 isolates in clade 1. For each gene, the number of haplotypes observed within the clade and the number of possible recombinant sets were as follows: for AAT1, 15 haplotypes and 10 possible recombinant combinations; for ACC1, 7 and 2; for ADP1, 10 and 0; for MPIb, 17 and 6; for SYA1, 24 and 20; for VPS13, 30 and 122; and for ZWF1b, 28 and 76. Within clade 1, the haplotype diversity was about 20 to 30% of that of the entire isolate collection, but for all genes except ADP1, there was still evidence for recombination between haplotypes. This suggests that the clade has not arisen by simple clonal expansion and mutation.
Evolution of eBURST clusters. The largest eBURST cluster (Fig. 1) was examined in terms of the types of changes that have occurred between linked strains, i.e., those that differed at only one of the seven fragments used for MLST, regardless of the number of SNP differences within that fragment. Approximately half of the mutational events observed could be explained by a loss of heterozygosity in the putative derived isolate relative to the putative founder; in these cases, the founder had two different haplotypes for the gene in question, and the derived isolate had two identical haplotypes. The remaining events could not be explained by this mechanism but could be a result of mutation (at one or more SNPs), mitotic recombination, or genetic exchange with another isolate.
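The loss-of-heterozygosity criterion described above can be written as a simple check on phased haplotype pairs. This is a sketch with hypothetical names. It assumes the two haplotypes of the founder and the derived isolate are already known for the fragment in question.

```python
def explained_by_loh(founder, derived):
    """True if the derived genotype could arise from the founder by loss of heterozygosity.

    founder, derived: pairs of haplotype strings for one MLST fragment.
    """
    f1, f2 = founder
    d1, d2 = derived
    # The founder must carry two different haplotypes; the derived isolate
    # must carry two copies of one of them.
    return f1 != f2 and d1 == d2 and d1 in (f1, f2)


def classify_change(founder, derived):
    """Label a founder-to-derived change at a single fragment."""
    if explained_by_loh(founder, derived):
        return "loss of heterozygosity"
    return "other (mutation, mitotic recombination, or genetic exchange)"


if __name__ == "__main__":
    print(classify_change(("ACG", "ATG"), ("ATG", "ATG")))
    print(classify_change(("ACG", "ATG"), ("ACG", "ACA")))
```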
Population genetics analysis. The putative founders of the 33 largest eBURST clusters were selected as a group of isolates that are dissimilar to each other and that represent much of the diversity of the whole set. Examination of their MLST genotypes showed that there were 52 SNPs (out of the total of 171) where the minor allele was sufficiently common that all three possible SNP genotypes were present among the 33 isolates. Of these 52 SNPs, the genotype distributions for 39 did not deviate significantly from Hardy-Weinberg equilibrium (P > 0.05). These 39 SNPs included at least 2 SNPs in each of the seven genes.
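To make the Hardy-Weinberg comparison concrete, the sketch below tests the genotype counts at one biallelic SNP against Hardy-Weinberg expectations. It uses a chi-square approximation with one degree of freedom, which is an assumption: the study used the Arlequin package, whose test may differ, and a chi-square approximation is rough for a sample of only 33 isolates. The function name and the example counts are hypothetical.

```python
import math


def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of genotype counts against Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: counts of the two homozygotes and the heterozygote.
    Returns (statistic, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For one degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Invented counts for 33 isolates, for illustration only.
    for counts in [(9, 16, 8), (16, 1, 16)]:
        chi2, p_value = hardy_weinberg_chi_square(*counts)
        verdict = "consistent with" if p_value > 0.05 else "deviates from"
        print(counts, f"chi-square = {chi2:.2f}", verdict, "Hardy-Weinberg equilibrium")
```

All three genotype classes must be present for the test to be informative, which matches the selection of the 52 SNPs described in the text.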
For each of the seven genes, one or two SNPs that showed the closest correspondence to Hardy-Weinberg equilibrium were used for an analysis of inter-SNP LD. A total of 11 SNPs were analyzed. Among the 55 inter-SNP pairs, only 7 showed significant LD (P < 0.05).
| DISCUSSION |
|---|
The epidemiological implications of surveys such as the present study are limited by the fact that the isolates typed do not represent systematically obtained material from defined sources. However, the number and diversity of the 1,391 isolates represented in the present test panel give us reason for confidence in the main conclusions of the study. We have divided the isolates into 17 clades in a way that leaves a small minority of isolates characterized as singletons. The previous analysis of MLST data for 416 C. albicans isolates designated 12 clades (63). As the database of sequence data grows, there will be an inevitable consolidation of information to define groups of C. albicans clades. The point at which we have set the boundary for clade differentiation is slightly broader for the present large data set than that for the previous analysis. While all but 5 isolates assigned to 1 of 12 clades in the previous analysis retained their clade assignments, the new analysis resulted in 45 isolates previously designated singletons becoming members of definable clades. Notable among these isolates is WO-1, the original white-opaque switching isolate (2), which is the second C. albicans strain to have its genome sequenced (http://www.broad.mit.edu/annotation/genome/candida_albicans/Home.html). WO-1 now emerges as a member of clade 6, unlike SC5314, the first strain used for C. albicans genome sequencing, which belongs to clade 1 (30).
Clade 1 is the most populous strain group in this and the previous study and correlates with the most populous cluster observed in various studies based on DNA fingerprinting and other genome-based technologies (39, 49, 55, 61). This demonstrates that the most common, globally distributed C. albicans strain types are those of MLST clade 1, and DST 69 is both the most frequently encountered member of clade 1 and its putative founding type. The notion has been advanced that C. albicans clade 1 strains may have a higher propensity than other types to undergo the change from commensal to pathogen, notwithstanding the requirement for reduction in host defenses needed to permit the change of status (55). It seems reasonable to hypothesize that clade 1 strains have a high capability for spread among the human population, since examples of DST 69 alone were isolated in North and South America, Japan, Hong Kong, Malaysia, the Middle East, and New Zealand as well as from the United Kingdom and Europe, which were the geographical sources of most of our isolates. It is less evident that clade 1 isolates have an enhanced ability to cause infection relative to other types. In our analysis limited to European isolates to minimize geographical influence, the proportion of bloodstream isolates assigned to clade 1 was smaller than the proportion of commensal isolates; however, a larger proportion of isolates from superficial infections was found in clade 1 (Table 4). In contrast, clade 4 could be regarded as particularly rich in blood isolates. Statistical analysis of the proportions of commensal and invasive isolates in different clades, as shown in Table 4, gave significant outcomes, even when subsets of the data were analyzed separately. We interpret these findings as indicative of a possible clade-related differential potential for strains to be encountered as commensals, superficial pathogens, and agents of disseminated infection. 
The observation that clade 1 isolates are very widely distributed and are particularly commonly found as commensals and superficial pathogens, but less often as pathogens of deep tissues, is compatible with the notion that these strains are better adapted to colonize and invade epithelial surfaces, whereas bloodstream dissemination is more likely to be a consequence of severe host immunosuppression than of virulence properties intrinsic to the fungus.
Geographical differences among C. albicans strain types are to be expected, since evolutionary changes from founding types should mainly have occurred independently in separated regions. However, the term "geographical enrichment" of different strain types adequately summarizes the findings with C. albicans. Because the species exists primarily as a commensal of humans and other warm-blooded animals, the extent to which populations remain geographically independent will always be limited by movements of the hosts. Thus, in our study, clade 2 was particularly rich in isolates from the United Kingdom, which made up 70% of the clade; more than 20% of isolates in clades 1 and 3 came from North America; isolates from the Far East were predominant among those in clades 14 to 17; and so on. A study of isolates from Taiwan by MLST clearly demonstrated differences between these and European isolates (11), but the extent of the distinction becomes diluted when these data are included in a larger data set such as the one presented here. Thus, no single clade could be described as comprised uniquely of isolates from a single geographical region. Prior work based on ABC subtyping alone found geographical variations in the prevalence of types A, B, and C (42), and this observation can be amplified by the knowledge that many MLST clades are composed mainly of isolates of the A or B type (Table 3). Care is needed in interpretation of clade geographical enrichments, since neither this nor previous studies demonstrating the effect have used panels of isolates matched for all properties except geographical origin and may therefore be geographically biased. Ca3 fingerprinting identified a clade originally described as an African clade (5), but the isolate panel tested in that study included mainly isolates from Africa. African clade isolates cluster in clade 4 by MLST. 
Among the 182 isolates of clade 4 in the present study, 14.3% originated from Africa and 37.4% originated from the United Kingdom, which suggests that clade 4 should not be described as Africa specific. However, the 14.3% prevalence of African isolates in clade 4 was higher than the prevalence of African isolates in the other four most populous clades (1 to 3 and 11), confirming at least an African enrichment of this clade.
A more distinct specificity towards the African continent is also seen in clade 13, where 8 (57%) of 14 isolates were of African origin. These isolates represent the clade once proposed as the new species C. africana (68). Clade 13 isolates formed a cluster well separated from the other types (Fig. 1). It is notable that isolate P2246 (ABC type B, DST 797) differed substantially from the ABC type A, DST 182 or 782 that characterized the 14 isolates in clade 13, but still clustered closely with clade 13 by UPGMA. This isolate, previously numbered AM335, also differed notably from other atypical African isolates in a prior study based on a combination of molecular phylogenetic methodologies (22).
Our findings concerning antifungal resistance among the C. albicans isolates are not novel: the strong association of flucytosine resistance with clade 1 has been described previously (17, 63), as has the association of flucytosine and azole resistance with MAT homozygosity (51, 53, 63). The association has been explained in terms of homozygosity of a particular allele of TAC1, which upregulates C. albicans efflux pumps and, like the MAT locus, is located on chromosome 5 (13). The present study does, however, suggest that the propensity towards antifungal resistance is higher among a/a isolates than among α/α isolates, even though the latter type is more common in the whole population. Our data also serve as a reminder that the majority of C. albicans isolates showing resistance to azole antifungal agents were obtained from oral samples from HIV/AIDS patients in the 1990s. The overall prevalence of antifungal-resistant isolates can only be estimated from a random sample of clinical isolates, not from a partly selected isolate panel such as the one used in this study.
Our analysis of haplotypes for putative recombination events and the lack of phylogenetic congruence between the seven gene fragments sequenced point to apparently separate evolutionary histories for these seven genes, with recombination events intervening sufficiently often to remove what would otherwise be expected to be a generally clonal evolutionary pattern in an asexually reproducing species. Mitotic recombination between homologous chromosomes, followed by parasexual chromosome exchange, could explain the high level of generation of new haplotypes. This scenario is difficult, if not impossible, to distinguish from true sexual reproduction, since either process could result in similar outcomes. Since chromosomal loss and reduplication events are already well evidenced in C. albicans (31, 57, 71, 72), it will be interesting to test examples of strain pairs differing only by a loss of heterozygosity at one MLST locus for complete or partial aneuploidy by sequencing markers across whole chromosomes or by comparative genome hybridization (57). Cryptic mating events, if they occur, may also contribute to the production of a population in which the most diverse strains are to a large extent in Hardy-Weinberg equilibrium, with a lack of LD between loci. In this respect, C. albicans resembles a sexually reproducing species (16, 26, 33, 36, 59, 60). Superimposed on these infrequent genetic recombination events are many small incremental changes, such as losses of heterozygosity and accumulations of single mutations, that can be seen by examining the relationships of strains within eBURST clusters.
These genetic events are the most probable underpinnings of the tendency of C. albicans isolates to be highly similar, but not always indistinguishable, among members of families (8) and among sequential samples from individuals (47). The maintenance of high levels of genetic diversity in the absence of frequent sexual matings may characterize the survival success of a fungus that is widespread as a commensal and becomes a pathogen only when host antimicrobial defenses are impaired.
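The Hardy-Weinberg expectation referred to in the discussion above can be illustrated for a single biallelic polymorphic site in a diploid organism. The sketch below is a generic textbook chi-square comparison of observed and expected genotype counts. The function name and the genotype counts are hypothetical and for illustration only; they are not data from this study, and this is not necessarily the procedure the authors used.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing genotype counts with Hardy-Weinberg expectations.

    n_aa, n_bb: counts of the two homozygous genotypes at one site.
    n_ab: count of heterozygotes at the same site.
    Returns (frequency of allele a, expected counts, chi-square statistic).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # allele frequency from genotype counts
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


# Hypothetical counts: the first site matches expectation exactly,
# the second has a deficit of heterozygotes.
for counts in [(25, 50, 25), (40, 20, 40)]:
    p, expected, chi2 = hardy_weinberg_chi2(*counts)
    # With one degree of freedom, values above about 3.84 reject
    # Hardy-Weinberg proportions at the 5% level.
    print(counts, f"p={p:.2f}", f"chi2={chi2:.1f}")
```

A heterozygote deficit of the kind shown in the second hypothetical example is what predominantly clonal reproduction with losses of heterozygosity would tend to produce, whereas agreement with expectation is what the text describes for the most diverse strains.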
| ACKNOWLEDGMENTS |
|---|
We are grateful to the many individuals who have supplied us with the C. albicans isolates, past and present, that formed the experimental material for this study, including J. B. Anderson, G. Chaves, M. Cuenca-Estrella, D. H. Ellis, J.-M. Gomez, G. Haase, M. Hanson, R. Hollis, E. M. Johnson, B. Jones, G. Just, C. C. Kibbler, N. Nolard, M. A. Pfaller, J.-L. Rodriguez-Tudela, J. Schmid, D. R. Soll, D. A. Stevens, F. Symoens, S. Takakura, and K. Y. Yuen. We gratefully acknowledge the skilled technical assistance of Christiane Bouchier, Diana Sharafi, and Julie Whyte. Keith Jolley (University of Oxford) provided valuable guidance in carrying out congruence analysis.
| FOOTNOTES |
|---|
Published ahead of print on 6 April 2007.
Supplemental material for this article may be found at http://ec.asm.org/.
| REFERENCES |
|---|