39
L’évolution des Eucaryotes Denis BAURAIN Département des Sciences de la Vie Université de Liège Société Royale des Sciences de Liège 20 septembre 2012

Denis BAURAIN

Embed Size (px)

Citation preview

Page 1: Denis BAURAIN

L’évolution des Eucaryotes

Denis BAURAIN

Département des Sciences de la VieUniversité de Liège

Société Royale des Sciences de Liège20 septembre 2012

Page 2: Denis BAURAIN

Plan de l’exposé

1. Qu’est-ce qu’un Eucaryote ?

2. Quelle est la diversité des Eucaryotes ?

3. Quelles sont les relations de parenté entre les grands groupes d’Eucaryotes ?

4. D’où viennent les Eucaryotes ?

Page 3: Denis BAURAIN

1Qu’est-ce qu’un Eucaryote ?

Page 4: Denis BAURAIN

Eukaryotic Cells

définition ultrastructurale : organelles spécifiques• noyau (1)• nucléole (2)• RE (5, 8)• Golgi (6)• centriole(s) (13)• mitochondrie(s) (9)• chloroplaste(s)• ...

http://en.wikipedia.org/

Page 5: Denis BAURAIN

http://reflexions.ulg.ac.be/

A eukaryotic gene is arranged in a patchworkof coding (exons) and non-coding sequences (introns).

Introns are eliminated while exons are spliced together to yield the mature mRNA used for protein synthesis.

Page 6: Denis BAURAIN

Gene DNA

pre-mRNA

Transcription

mature mRNA

Alternatif splicing

Protein

Translation

Exon1 Exon2 Exon3 Exon4 Exon5 Exon6

http://reflexions.ulg.ac.be/

In many Eukaryotes, almost all genes can lead to different proteins through a process termed alternative splicing.

Page 7: Denis BAURAIN

Endosymbiotic Organelles

NATURE REVIEWS | GENETICS VOLUME 5 | FEBRUARY 2004 | 125

R E V I EW S

divergence indicates recurrent transfer events, fromancient to contemporary. The human genome has atleast 296 different numts of between 106 bp and 14,654bp (90% of the mitochondrial genome) that cover theentire mtDNA circle34. Other studies tallied 612 mtDNAinsertions in the human genome35, a greater numberbecause different sequence conservation criteria foridentifying numts are used in different studies36. Oldernumts are more abundant in the human genome thanrecent INTEGRANTS, indicating that mtDNA can be ampli-fied once inserted36,37 and many are organized as tandemrepeats35. Barely detectable numts are present inPlasmodium32, but highly conserved numts have now

the term ‘NUMTS’ was coined to designate these nuclearstretches of mtDNA26. Numts have been found in thenuclear genomes of grasshoppers27, primates28,29 andshrimps30, and are often mistaken for bona fidemtDNA31,32.

Eukaryotic genome sequences have more fullyexposed the scale of integrated mitochondrial andcpDNA in the nuclear genome. Fragments of organelleDNA are becoming recognized as a normal attribute ofnearly all eukaryotic chromosomes. For example, theyeast genome contains tracts with 80–100% similarityto mtDNA that range in size from 22 to 230 base pairs(bp) integrated at 34 sites33. This range of sequence

ARCHAEBACTERIA

An ancient group of organismsthat have ribosomes and cellmembranes that distinguishthem from eubacteria. Theysometimes showenvironmentally extremeecology.

NUMT

An acronym to describe nuclearintegrants of mitochondrialDNA.

Box 2 | Endosymbiotic evolution and the tree of genomes

Intracellular endosymbionts that originally descended from free-living prokaryotes have been important in the evolution of eukaryotes by givingrise to two cytoplasmic organelles. Mitochondria arose from !-proteobacteria and chloroplasts arose from cyanobacteria. Both organelles have madesubstantial contributions to the complement of genes that are found in eukaryotic nuclei today. The figure shows a schematic diagram of theevolution of eukaryotes, highlighting the incorporation of mitochondria and chloroplasts into the eukaryotic lineage through endosymbiosis andthe subsequent co-evolution of the nuclear and organelle genomes. The host that acquired plastids probably possessed two flagella113. The nature ofthe host cell that acquired the mitochondrion (lower right) is fiercely debated among cell evolutionists. The host is generally accepted by most tohave an affinity to ARCHAEBACTERIA but beyond that, biologists cannot agree as to the nature of its intracellular organization (prokaryotic, eukaryoticor intermediate), its age, its biochemical lifestyle or how many and what kind of genes it possessed120. The host is usually assumed to have beenunicellular and to have lacked mitochondria.

PlantsEukaryotes

Early diversification of algal/plant lineages andgene transfer to the host

Cyanobacteria Proteobacteria Archaebacteria

Origin of mitochondria

The host that acquired mitochondria

Ancient proteobacterium

Ancient cyanobacterium

Ancient protozoon

Origin ofplastids

Early diversification ofeukaryotic lineages andgene transfer to the host

Timmis et al. (2004) Nat Rev Genet 5:123-135

origines endosymbiotiques de la mitochondrie et du chloroplaste

Lynn Margulis

Page 8: Denis BAURAIN

Genome Reduction

NATURE REVIEWS | GENETICS VOLUME 5 | FEBRUARY 2004 | 129

R E V I EW S

Mitochondrion-to-nucleus transfers. In yeast, a recombi-nant plasmid, which was introduced into a genome-lacking mitochondrion, was shown to relocate to thenucleus as an EPISOME (that is, not recombined intonuclear DNA) at a frequency of 2 x 10–5 per cell pergeneration80.A lower frequency (5 x 10–6 per cell per gen-eration) of episomal relocation was observed when theplasmid was integrated into the mitochondrial chromo-some81. In these experiments, the released DNA was epi-somal, indicating that release of DNA from the yeastmitochondrion is frequent, but integration might be rarein yeast nuclei because of their characteristically high levelof reliance on homologous recombination for DNAincorporation. Newer work indicates that mtDNA escapein yeast occurs through an intracellular mechanism thatdepends on the composition of the growth medium andthe genetic state of the mitochondrial genome, and isindependent of an RNA intermediate82.

Chloroplast-to-nucleus transfers in higher plants.Onlymore recently has it been possible to quantify the processof chloroplast-to-nucleus DNA transfer. To determinethe frequency of plastid DNA transfer and integrativerecombination into the higher plant nuclear genome, theplastome of tobacco was transformed with a neomycinphosphotransferase gene (neoSTLS2) that was tailoredfor expression only in the nuclear genome83 (BOX 4). In 16out of ~250,000 seedlings, the neoSTLS2 marker hadbeen integrated into a nuclear chromosome, each timein a different location, which equates to a chloroplast-to-nucleus DNA transfer frequency of one in 16,000gametes tested. The diversity of insertion locationsindicates that the marker might be transposed duringmeiotic or postmeiotic events during male gamete for-mation because the extreme alternative explanation forthese integrations — a single transfer event that is subse-quently amplified by somatic cell division — would leadto the same integration site being found in all plants withchloroplast integrants83. In agreement with the DNAintegrations induced by BIOLISTIC TRANSFORMATION, thesetransfers show no particular preference for recombina-tion sites in either the nuclear or plastid genomes.

Using a similar experimental strategy with a trans-gene in a different plastomic location, the frequency ofchloroplast-to-nucleus transposition was estimated intobacco somatic cells84. Leaf tissue from transplastomictobacco that contained an intron-less neo gene was cul-tured on medium that contained high concentrations ofkanamycin (100–400 mg/L). Twelve highly resistantplants were regenerated, 11 of which showed Mendelianinheritance of the antibiotic-resistant phenotype.After acourageous approximation of the number of regenerat-able cells that were present in LEAF EXPLANTS, a chloroplast-to-nucleus transposition frequency of one event in ~fivemillion somatic cells was estimated84. Taken at face value,the frequency in somatic cells is ~300 times lower thanthat in male gametes of the same species83. The pro-grammed degeneration of plastids that occurs duringpollen-grain development — the process that underpinsUNIPARENTAL INHERITANCE of plastid genes (FIG. 1) — mightexplain this difference.After the chloroplast genomes are

Evolutionarytime

201 141 125 181 59 79 92 100 87 73 80 81 80 80 136 3,200 ~2,000 –>7,000

Por

phyr

a

Gui

llard

ia

Odo

ntel

la

Cya

nidi

um

Eugl

ena

Chl

orel

la

Nep

hros

elm

is

Mes

ostig

ma

Mar

chan

tia

Pin

us

Flow

erin

g pl

ants

Cya

noph

ora

Syn

echo

cyst

is

Oth

er c

yano

bact

eria

Lineage diversification

Chloroplast genome reductionand gene transfer to the nucleus

Photosynthetic electron transport, respiration, ATPase

Translation

Other

Functional categories

Genome reduction

Figure 2 | Reduction of the chloroplast genome over time. We know that plastidsoriginated more than 1.2 billion years ago, because fossil red algae of that age have been found121. The ancestor of plastids was a free-living cyanobacterium and therefore must have possessed several thousand genes as did its contemporaries. Subsequent to the invention of a plastid protein import apparatus (a prerequisite for relocating genes that encode proteins required by the organelle to the nucleus), plastids relinquished most of their genes to the genome of their host cell. This gene relocation process occurredmassively at the onset of endosymbiosis and continued in parallel during algaldiversification, yet the same core set of genes (for photosynthesis and translation) has been retained in all lineages. The size of the bars shown indicates the genome sizes ofchloroplasts from a diversity of plant lineages, from red algae (Porphyra) to angiosperms(flowering plants) and Cyanophora (belonging to the most ancient lineage of photosynthetic eukaryotes), and their free-living cyanobacterial relatives (cyanobacteria). The reduction in chloroplast genome size has been mapped onto a phylogenetic tree of the relationships among these genomes. Numbers at the end of branches indicate thenumber of genes that are present in the respective genome. These genes are divided intothree functional categories that are represented by the three different colours making up the bars. Data from REF. 19.

Timmis et al. (2004) Nat Rev Genet 5:123-135; Grzebyk et al. (2003) J Phycol 39:259-267

262

DANIEL GRZEBYK ET AL.

rarachniophyte plastids appear to have been acquiredfrom secondary endosymbiosis of chlorophytes (vande Peer et al. 1996, Bhattacharya and Medlin 1998,Palmer and Delwiche 1998, Turmel et al. 1999). Ac-cordingly, the set of genes retained in secondary en-dosymbiotic plastids is almost totally included in theprimary lineage from which they descended (Fig. 1).The only exceptions are a small number of genes re-tained in secondary plastids that may not be presentin their related extant primary plastids:

minD

and

minE

in cryptophytes (Fig. 1),

ycf66

in diatoms, and

ycf13

and

ycf67

in euglenophytes. These genes are rel-ics from ancestral primary plastids that were later lost.

gene retention and losses in chloroplast genomes reflect evolutionary patterns

Whereas the presence of specific genes provides cluesabout the origin of plastids, plastid gene losses, whichare effectively irreversible, also provide evolutionaryinformation. There are 200 or fewer protein-codinggenes in primary plastids. Assuming that the numberof genes in the original primary ur-plastid was similarto that found in the extant cyanobacterium

Synechocys-tis

PCC6803 (3168 protein-coding genes; Kaneko etal. 1996), more than 93% of the ancestral endosym-biont genome was lost or transferred to the host nu-cleus in the primary plastid lineages (Fig. 2). Greenplastids exhibit the most numerous gene losses, eitherspecific losses or those in common with glaucophytes.Subsequently, patterns of gene losses occurred differ-ently at phylum level radiations within primary lin-eages and at radiations accompanying secondary sym-biotic events. Between 1% and 10% of the remainingplastid genes were lost at phylum level radiationswithin rhodophytes and chlorophytes. A larger num-ber of the remaining primary plastid genes were lostat radiations linked to secondary symbiotic events. Forexample, with the divergence of cryptophyte and ba-cillariophyte plastids from the rhodoplasts, between15% and 20% of the plastid genes were lost. Approxi-mately 30% of the genes were lost at the divergence ofeuglenoid plastids from chloroplasts. This analysis sug-gests that endosymbiotic events resulted in relativelyrapid and massive gene losses in plastids, whereas theradiations within phyla were accompanied by slowerand more gradual genomic erosion.

We used the patterns of gene loss, inferred from asimple presence/absence analysis, to construct anevolutionary tree (Fig. 3). In this analysis, we implic-itly assume that each gene retained in a plastid re-flects an ancestral status, and hence, at least within alineage, the more evolved a plastid, the more geneswere lost. The patterns of gene loss clearly separatethe red plastid lineage, with rhodophytes at the baseof the cluster, from the green lineage. Further out ofthe red lineage are the secondary endosymbiotic al-gae: the cryptophytes and bacillariophytes. In thegreen lineage, the

Mesostigma

plastid appears ancestralwithin the chlorophytes, and the pattern of gene losssuggests that land plants diverged early from chloro-

Table

2. The protein-coding gene nomenclature in algal

chloroplast genomes (adapted from Stoebe et al. 1998).

Gene code Protein function

accA,B,D

Acetyl-CoA carboxylase

acpP

Acyl carrier protein

apcA,B,D,E,F

Allophycocyanin phycobilisome

argB

Acetylglutamate kinase

atpA,B,D,E,F,G,H,I

ATP synthase

bas1

Thiol-specific antioxidant protein

bioY

Biotin synthase

carA

Carbamoyl phosphate synthetase

cbbX

Red type Calvin cycle operon

ccsA

Heme attachment to plastid cytochrome

ccemA

Envelope membrane protein

chlB,I,L,N

Protochlorophyllide reductase

clpC

/

P

Caseinolytic-like protease (Clp)

cpcA,B,G

Phycocyanin phycobilisome

cpeA,B

Phycoerythrin

crtE

Geranylgeranyl pyrophosphate synthetase

cysA,T

Probable transport proteins

dfr

Drug sensory protein

dnaB

DNA-replication helicase

dnaK

Hsp 70-type chaperone

dsbD

Thiol:disulfide interchange protein

fabH

!

-Ketoacyl-acyl carrier protein synthase III

fdx

2[4Fe-4S] ferredoxin

ftrB

Ferredoxin-thioredoxin reductase

ftsH,W

Division proteins

glnB

Nitrogen regulatory protein

gltB

Glutamate synthase (GOGAT)

groEL,ES

Chaperonins 60 and 10 kDa

hemA

5-Aminolevulinic acid synthase

hisH

Histidinol-phosphate aminotransferase

I-CvuI

DNA endonuclease

ilvB,H

Acetohydroxyacid synthase

infA,B,C

Translational initiation factors

minD,E

Homologues of bacterial cell division regulators

mntA,B

Manganese transport system proteins

moeB

Molybdopterin biosynthesis protein

nadA

Quinolinate synthase

nblA

Phycobilisome degradation protein

ndhA-J

NADH-plastoquinone oxidoreductase

ndhK

NADH-ubiquinone oxidoreductase

ntcA

Global nitrogen transcriptional regulator

odpA,B

Pyruvate dehydrogenase E1 component

pbsA

Heme oxygenase

petA,B,D,F,G,J,L,M

Photosystem electron transport proteins

pgmA

Phosphoglycerate mutase

preA

Prenyl transferase

psaA-M

PSI proteins

psbA-X

PSII proteins

rbcLg

/

r

RUBISCO large subunit, green and red forms

rbcR

RUBISCO operon transcriptional regulator

rbcSg

/

r

RUBISCO small subunit, green and red forms

rdpO Probable reverse transcriptaserne RNAseErpl1-36 Large subunit ribosomal proteinsrpoA,B,C1,C2 RNA polymeraserps1-20 Small subunit ribosomal proteinssecA,Y Preprotein-translocasesyfB Phenylalanine tRNA synthetasesyh Histidine tRNA synthetasethiG Thiamine biosynthesistrpA Tryptophane synthasetrpG Anthranilate synthase, glutamine

amidotransferasetrxA Thioredoxintsf Translational elongation factor TstufA Translational elongation factor Tuupp Uracil phosphoribosyltransferaseycf Hypothetical proteins

chloroplastes

1.gènes inutiles pour un endosymbionte

2.gènes redondants avec ceux du noyau

3.gènes transférés dans le noyau

Page 9: Denis BAURAIN

Endosymbiotic Gene Transfer

128 | FEBRUARY 2004 | VOLUME 5 www.nature.com/reviews/genetics

R E V I EW S

mixing and matching of endosymbiotically inheritedfunctions with newly evolved, eukaryote-specific bio-chemistry78,79.

In summary, retargeting of proteins among organ-elles and, most notably, the cytosol is a highly dynamicand influential process in eukaryotic evolution. Whengenes are donated from organelles to the nucleus, thereis no homing device that automatically re-routes theprotein product back to the donor organelle. Rather,chance, natural selection and lineage diversificationseem to govern the intracellular targeting fate of genesthat organelles donate to the chromosomes of theirhost. In this sense, gene donations from organelles areimportant starting material for the evolution of newgenes that are specific to the eukaryotic lineage.

Laboratory estimates of transfer frequenciesComparative genome analyses show us that gene trans-fers have occurred at different times in the past, andindicate that the process is continuing. The challengehas been to get direct empirical estimates of the fre-quency at which DNA is being transferred among cellularcompartments.

new patterns of compartmentalization in the cell11.Moreover, gene donations from organelles often lead tofunctional replacement of pre-existing and functionallyequivalent host genes, a process known as endosymbi-otic gene replacement69.

The number of proteins that are predicted to beimported into mitochondria varies markedly acrosseukaryotic groups, ranging from ~150 proteins in theparasitic fungus Encephalitozoon cuniculi to ~4,000 pro-teins in humans. Only ~50 proteins were common tothe mitochondria of all non-parasitic eukaryotes72.Similarly, the number of nuclear-encoded proteins thatare predicted to be targeted to chloroplasts differs by afactor of two between rice and Arabidopsis73. Such pre-dictions still have clear limitations but are improvingwith the accumulation of more direct experimental datafor localization73,74.

For biochemical pathways that are present in boththe original host and its endosymbionts, competitioncan ensue8,66. In some cases, the pathway of the sym-biont can predominate18,75 but hybrid pathways candevelop from both host and endosymbiont sources66,76,77.Organelle division is a prime example of lineage-specific

EPISOME

A unit of genetic material that iscomposed of a series of genesthat sometimes has anindependent existence in a hostcell and at other times isintegrated into a chromosome ofthe cell, replicating itself alongwith the chromosome.

BIOLISTIC TRANSFORMATION

A commonly usedtransformation method inwhich metal beads are coatedwith gene contructs and shotinto cells.

LEAF EXPLANTS

Small sterile sections of leaf orother plant tissue from whichwhole plants might sometimesbe regenerated.

UNIPARENTAL INHERITANCE

The mode of inheritance thatgenerally characterizes the genesof cytoplasmic organelles inwhich only one of the two sexualpartners contributes to theoffspring.

Other

Mitochondrion

Proteobacterium-likeendosymbiotic ancestor

Cyanobacterium-likeendosymbiotic ancestor

Proteins

Chloroplast

Nucleus

OrganelleDNA

OrganelleDNA

Figure 1 | Organellar DNA mobility and the genetic control of biogenesis of mitochondria and chloroplasts.Theeukaryotic mitochondrion is derived from a proteobacterial endosymbiotic ancestor but most of the genes that were originallypresent in this ancestor’s genome have been transferred to the nucleus (thick black arrow), with only a small number being retainedin the organelle (blue circle). Similarly, most of the genes from the cyanobacterial endosymbiont ancestor of the chloroplast werealso transferred to the nucleus (thick black arrow). So, as a result, cytoplasmic organelles are heavily dependent on nuclear genesand import more than 90% of their proteins from the cytoplasm (white arrows). The dotted arrows indicate how DNA ofmitochondrial (blue) and chloroplast (green) origin is still being transferred to the nucleus. Chloroplast and nuclear sequences arealso found in the mitochondrial genome but little or no promiscuous DNA is located in the chloroplast.

Timmis et al. (2004) Nat Rev Genet 5:123-135

relocalisation dans le noyaudes gènes encodés dans les

organelles endosymbiotiques

Bill Martin

Page 10: Denis BAURAIN

Woese’s Tree of LifeSSU rRNA (16S/18S)

3 domaines

1. eubactéries (Procaryotes)

2. archébactéries (Procaryotes)

3. Eucaryotes

Woese (1987) Microbiol Rev 51:221-271; http://pacelab.colorado.edu/Carl Woese

définitionphylogénétique

3 domaines1. Archaea2. Bacteria3. Eukarya

Page 11: Denis BAURAIN

2Quelle est la diversitédes Eucaryotes ?

Page 12: Denis BAURAIN

Whittaker’s System

gether are the Archimycetes of Giiu-mann (65, 57)] become phyla ofprotists, adjacent to absorptive andspore-forming organisms regarded asprotozoans, the Sporozoa and Cnidospo-ridia. Other wall-less fungi, the slimemolds, probably include at least threeseparate evolutionary lines from the uni-cellular condition (66), the true slimemolds (Myxomycetales), cellular slimemolds (Acrasiales), and cell-net slimemolds (Labyrinthulales). These have, fortheir separate origin and different orga-nization, been treated as three phyla andgrouped in a polyphyletic subkingdomGymnomycota.

This treatment results in a consider-able elevation of taxa; groups which areorders and classes in most other classi-fications become phyla here, in somecases separated into different branchesand subkingdoms. Recognition of threephyla of slime molds and seven of chy-trid and mycelial fungi is not, however,undue taxonomic inflation. The rangeof forms comprised in the fungi is wide,and the evidence of independent originof various fungal and slime mold groupsis clear. It is suggested that true fungiand slime molds are not best treated astwo phyla, that their designation as suchis in part a consequence of the effort totreat these groups within the plant orthe protoctist kingdom, and that the ex-pansion of each into a number of phylais more reasonable.

I believe that this system better rep-resents broad relationships in regard toboth levels of organization and nutritivemodes affecting kinds of organizationthan the two-kingdom and Copelandsystems. The red and brown algae andthe fungi may seem better placed, theformer as the higher plants of the sea,the latter as the third major evolution-ary direction among higher organisms.The system may further have much ad-vantage over the two-kingdom systemand some over the Copeland system inthe coherence and definable characterof the kingdoms as units of classifica-tion.

Limitations of the Five-Kingdom System

1) The distinction of the unicellularversus the multicellular and multi-nucleate conditions becomes the line ofdivision and difficulty. The phylumChlorophyta includes intergrading uni-cellular, colonial-unicellular, and multi-cellular forms and consequently violatesthe definition either of the Plantae (in10 JANUARY 1969

which it is placed here) or of the Pro-tista (in which it could with equal jus-tice be placed). The slime molds crossthe distinctions of the kingdoms in bothnutrition and organization, and offer afree choice of treatment as aberrantfungi, eccentric protists, or very peculiaranimals. The line from the unicellularto multicellular and multinucleateorganization has been crossed by a num-ber of independent phyletic lines. I sug-gest that the transition between the uni-cellular and multicellular-multinucleateconditions is a better conceptual divisionbetween lower and higher organismsthan degree of tissue differentiation. Thepractical difficulties with borderlinegroups are at least as great, and may begreater, when the separation is based onthe unicellular condition rather thandegree of tissue differentiation. There isroom for different judgments on themerits of the two lines of division.

2) The three higher kingdoms arepolyphyletic. The Rhodophyta and

Plantae

Phaeophyta are recognized to havecome from different unicellular ances-tors than the Chlorophyta; the resem-blance of these three groups as higherplants results from convergence. Thereis reason also to suspect that thesealgae supplement photosynthetic nutri-tion by absorption (67). Judged by thecriterion of monophyly, the Plantae astreated here may seem less a kingdomthan an alliance of separate groupswhich are multicellular and predomi-nantly photosynthetic. It is also truethat the Metazoa in its traditional formis polyphyletic, with separate derivationto be assumed for two and probably allthree of its subkingdoms. The kingdomFungi includes, as indicated, probablytwo convergent groups of chytrid andmycelial fungi and three of slime molds.

3) Even with the multicellular algaeand higher fungi excluded, the Protistais a grouping of diverse organisms ofdisparate directions of evolution. Neces-sarily, some protist phyla are more

Fungi Animalia

0el

Fig. 3. A five-kingdom system based on three levels of organization-the procaryotic(kingdom Monera), eucaryotic unicellular (kingdom Protista), and eucaryotic multi-cellular and multinucleate. On each level there is divergence in relation to threeprincipal modes of nutrition-the photosynthetic, absorptive, and ingestive. Ingestivenutrition is lacking in the Monera; and the three modes are continuous along numerousevolutionary lines in the Protista; but on the multicellular-multinucleate level the nutri-tive modes lead to the widely different kinds of organization which characterize thethree higher kingdoms-Plantae, Fungi, and Animalia. Evolutionary relations are muchsimplified, particularly in the Protista. Phyla are those of Table 1; but only major animalphyla are entered, and phyla of the bacteria are omitted. The Coelenterata comprisethe Cnidaria and Ctenophora; the Tentaculata comprise the Bryozoa, Brachiopoda, andPhoronida, and in some treatments the Entoprocta.

157

on

Ap

ril 1

8,

20

11

ww

w.s

cie

nce

ma

g.o

rgD

ow

nlo

ad

ed

fro

m

Robert H. Whittaker

Whittaker (1969) Science 163:150-160

Page 13: Denis BAURAIN

Eukaryotic Diversity

Walker et al. (2011) Parasitology 138:1638-1663

Page 14: Denis BAURAIN

Environmental Surveys

Lopez-Garcia et al. (2001) Nature 409:603-607

!"##"$% #& '(#)$"

*+, !"#$%& ' ()* +,- ' . /&0%$"%1 2,,. ' 333456789:4;<=

>!= ?@65A7<5B; C96;7B<5 65DE C<9 ;<=?69BF<5E 6@F< C9<= 7G:=B;9<HB6@ C96;7B<5">!= 67 IE,,,= D::?4 "C7:9 ?697B6@ F:J8:5;B5K<C 7G: I! 9:KB<5 <C 7G: K:5: LM,, H6F: ?6B9FE H?E <5 6N:96K:OE 0*"P#F:69;G:F> 65D ?GQ@<K:5:7B; 9:;<5F798;7B<5 DBF765;: =:7G<DF ?9<RNBD:D 8F 3B7G 6 S9F7 F89N:Q <C 7G: 7Q?: <C :8A69Q<7B; F:J8:5;:F?9:F:57 B5 <89 F6=?@:F4 #3:57QRC<89 9:?9:F:5767BN: ;@<5:F C9<= 6@@D:?7GF 3:9: F8HF:J8:57@Q ;G<F:5 C<9 ;<=?@:7: F:J8:5;B5K4 #G:;<=?@:7: F:J8:5;:F 3:9: 6@BK5:D 3B7G .E++I 6DDB7B<56@ .TP 9%!"K:5: F:J8:5;:F 9:79B:N:D C9<= D676H65AF4 " F8HF:7 <C .,. ;<=?@:7:F:J8:5;:F 36F 7G:5 F:@:;7:D C<9 ?GQ@<K:5:7B; 656@QFBFE 76AB5K F?:;B6@;69: 7< B5;@8D: 6 76U<5<=B;6@@Q H9<6D F6=?@: <C :8A69Q<7:F L6@@ 65D;@<F:F7 9:@67BN:F 7< <89 F:J8:5;:FO 7< =B5B=BV: 697:C6;7F 9:@67:D 7<76U<5<=B; F6=?@B5K4 W: ;<5F798;7:D DBF765;: L5:BKGH<89RX<B5B5KE!YOE =6UB=8=R?69FB=<5Q LZ[O 65D =6UB=8=R@BA:@BG<<D LZ*O79::FE 3GB;G ?9<D8;:D FB=B@69 ;<5K98:57 9:F8@7F4 /BK89: . FG<3F 65Z* 79:: DBF?@6QB5K 7G: :8A69Q<7B; =B;9<HB6@ DBN:9FB7Q C<85D4"F 36F :U?:;7:D <C ;<@DE GBKG@Q <UQK:567:D 367:9FE 7G:=6X<9B7Q <C

F:J8:5;:F 6CS@B67: 3B7G 7G: :8A69Q<7B; \\;9<35]]E 7G: D:5F:@QH965;G:D 6?B;6@ ?697 <C 7G: :8A69Q<7B; 79::^4 _<3:N:9E 3: C<85D7G9:: ?GQ@<7Q?:F H:@<5KB5K 7< 7G: :69@Q H965;GB5K ?697 <C 7G: .TP9%!" 79:: L/BK4 .6O4 `_.+TR>R&a`.T 9:?9:F:57F 6 5:3 @B5:6K::=:9KB5K B5 7G: 9:KB<5 <C 7G: "9;G:V<64 #G: @69K: @:5K7G <C B7FH965;G F8KK:F7F 7G67 B7 ;<8@D ;<99:F?<5D 7< 6 ?696FB7: 3G<F: 9%!"G6F :N<@N:D 96?BD@Q4 b7 3<8@D 7G8F H: \67796;7:D] 7< 7G: H6F: <C 7G:79:: HQ 6 @<5KRH965;G 67796;7B<5 697:C6;7E 6F B5D::D <;;89F 3B7GZB;9<F?<9BDB6M4 #GBF =6Q H: F8??<97:D HQ 7G: <;;899:5;: <C F:N:96@F?:;BS; D:@:7B<5F B5 7GBF F:J8:5;: LD676 5<7 FG<35OE 6F C6F7R:N<@NB5K:8A69Q<7B; F:J8:5;:F 69: <C7:5 ;G696;7:9BV:D HQ @:5K7G N69B67B<5T4`_.+>R&a`.. 6@F< ;<99:F?<5DF 7< 6 5:3 :8A69Q<7B; @B5:6K: <C85;:976B5 ?GQ@<K:5:7B; 6F;9B?7B<5E 6@7G<8KG B7 :=:9K:F B5 6 9:KB<5<C 7G: 79:: <;;8?B:D HQ 6=<:H<BD <9K65BF=F L!"#$%&%'($)%E*+&%'($)% <9 ZQU<V<6O L/BK4 .6O4 W: ;<8@D 9:79B:N: N:9Q C:3F:J8:5;:F 3B7G 7G: ?9B=:9 F:7 &aR./ c &aR.>2,%E 3GB;G 36F8F:D <5@Q 3B7G 7G: IE,,,R= F6=?@:E H87 6@@ <C 7G:= LG:9: 9:?9:RF:57:D HQ `_.+TR&a0.O 3:9: 9:@67:D 7< ,-./(+$'% F??4E 3GB;G 69::8K@:5<V<65 G:7:9<79<?GF C9:J8:57@Q C<85D B5 =69B5: H:57GB;FB7:F-4#G: DBN:9FB7Q <C ;9<35 :8A69Q<7:F BF =8;G @69K:9 L/BK4 .HO4 #G:

=<F7 C9:J8:57@Q 9:79B:N:D K9<8?F 3:9: 7G: 6@N:<@67:FE C<@@<3:D HQG:7:9<A<57F4W: 6@F< C<85D F:J8:5;:F H:@<5KB5K 7< C85KBE 65D 7< 7G:6=<:H<BD ?G6K<79<?GB; 6;657G69:65 96DB<@69B64 /85KB 9:6CS9=7G:=F:@N:F 6F <5: <C 7G: =<F7 :;<@<KB;6@@Q F8;;:FFC8@ :8A69Q<7B;@B5:6K:Fd 7G:Q G6N: :N:5 H::5 BF<@67:D C9<= 7G: H<77<= <C 7G:Z69B656 79:5;G L.,ET-M=O.,4 WB7GB5 7G: G:7:9<A<57FE 3: D:7:;7:DF:J8:5;:F 9:@67:D 7< 7G: @6HQ9B57G8@BDF L`_.+MR&a`.,O4 #G:F: 69:9:@67BN:@Q ;<==<5 B5 7G: F:6 65D ?@6Q 6 9<@: B5 D:;<=?<FB7B<5?9<;:FF:F ;<@<5BVB5K C6:;6@ ?:@@:7F 6@F< 85D:9 D::?RF:6 ;<5DB7B<5F..4#3< <7G:9 F:J8:5;:FE `_.+TR>R&a`>I 65D `_.++R&a`.,E D< 5<7;@:69@Q 6CS@B67: 3B7G 65Q A5<35 F?:;B:FE 65D =6Q 9:?9:F:57 5:3@B5:6K:F <C G:7:9<A<57F4 W: 6@F< 9:79B:N:D 6 ?:5567: DB67<=F:J8:5;: L`_.+TR>R&a`>+O 67 IE,,,= 7G67 ;<8@D ;<99:F?<5D 7<6 FB5AB5K ;:@@4 _<3:N:9E 6F B7 BF N:9Q FB=B@69 7< !0$12(3+-&405"-% F??4F:J8:5;:FE ;<==<5 DB5<e6K:@@67: :5D<FQ=HB<57FE 7GBF F:J8:5;:;<8@D B5F7:6D D:9BN: C9<= 6 DB5<e6K:@@67: :5D<FQ=HB<574"@N:<@67: F:J8:5;:F 3:9: HQ C69 7G: =<F7 DBN:9F: B5 <89 F6=?@:F4

WB7GB5 7G: ;<==<5@Q ?9:D67<9Q ;B@B67:FE 3: <H76B5:D 5:3 <@BK<RGQ=:5<?G<9:65 L`_.+TR>R&a`^O 65D ;<@?<D:65 L`_.+MR&a`2IO F:J8:5;:F4 W: 6@F< 9:;<N:9:D 7Q?B;6@ DB5<e6K:@@67:F:J8:5;:F C9<= 6@@ D:?7GF 65D H<7G ?@65A7<5B; C96;7B<5F L/BKF .H65D 2O4 #G:F: F:J8:5;:F 69: 9:@67:D 7< fQ=5<DB5B6@:F L<C7:5 @6;AB5K6 7G:;6 ;:@@ 36@@O 65D [9<9<;:5796@:F L7G:;6 3B7G 73< ?@67:FOE 3GB;G69: 8F86@@Q F=6@@.24 b57:9:F7B5K@QE 7G: N6F7 =6X<9B7Q <C F:J8:5;:F<H76B5:D K9<8?:D B5 73< =6X<9 ;@6D:F H:73::5 DB5<e6K:@@67:F 65D6?B;<=?@:U65F L3: G6N: 7:9=:D 7G:F: =69B5: 6@N:<@67: K9<8?F b 65DbbO4 #G:F: F:J8:5;:F 3:9: =6B5@Q 9:79B:N:D C9<= 7G: F=6@@:F7?@65A7<5B; C96;7B<5 67 6@@ D:?7GF L/BKF .H 65D 2O4 b5 7:9=F <C K:5:7B;DBN:9K:5;:E 7G: DBN:9FB7Q C<85D 3B7GB5 7G:F: K9<8?F BF :J8BN6@:57 7<

DH148-5-EKD18 Physarum polycephalum

DH148-EKB1 Diplonema papillatum

Euglena gracilis DH145-EKD11

Phreatamoeba balamuthi Entamoeba histolytica

Dictyostelium discoideum Ammonia beccarii

95

50

MicrosporidiaDiplomonadida

Trichomonadida

Crown eukaryotes

Haplosporidia

Myxozoa

250 m500 m

2,000 m

3,000 m

3,000 m (> 5 µm)

100

Blepharisma americanum Stylonychia pustulata

Tetrahymena pyriformis DH148-5-EKD6

Anophyroides haemophila DH147-EKD23

Pseudoplatyophrya nana Colpoda inflata

DH147-EKD19 DH145-EKD20

DH147-EKD20DH148-EKD27

DH148-EKD14 DH147-EKD3

DH147-EKD16 DH147-EKD6

DH144-EKD3 DH148-EKD22

DH147-EKD18 DH145-EKD10

Perkinsus marinus Crypthecodinium cohnii

Pyrocystis noctiluca Noctiluca scintillans

DH147-EKD21 Gymnodinium mikimotoi

Symbiodinium microadriaticum DH147-EKD17

Acanthometra sp.- AF063240 Chaunacanthid sp.-218

Heteromita globosa Euglypha rotunda

Blastocystis hominis Cafeteria roenbergensis

Ulkenia profunda Labyrinthuloides haliotidis

DH147-EKD10 Labyrinthuloides minuta

Schizochytrium minutum Thraustochytrium multirudimentale

DH148-5-EKD53 Developayella elegans

DH144-EKD10 Lagenidium giganteum

Hyphochytrium catenoides

Fragilaria striatula DH148-5-EKD54

Pseudo-nitzschia pungens Pelagomonas calceolata

Giraudyopsis stellifera

Vacuolaria virescens Nannochloropsis salina

Fucus gardneri

Cyanophora paradoxa

Hydra littoralis Saccharomyces cerevisiae

DH148-5-EKD21 Eupenicillium javanicum

100

50

12 84

42 89 39 38

100

70 69

48

99

100

29

57 19

100

100

Apicomplexa

Chlorarachniophyta

HaptophyceaeGreen plantsCryptophyta

AcanthamoebidaeRed algae

Chrysophyceae

DictyochophyceaeBolidophyceae

Centric diatoms

Metazoa

Fungi

Ciliates

Dinoflagellates

Pennate diatoms

Alveolates

Marine alveolateGroup II

Marine alveolateGroup I

Acantharea

ThraustochytridsLabyrinthulids

&

Heterokonts

a

b

48

96

80

100

100

100

43

!"#$%& ' !"#$%&%'($)*($+,,- ./** ,0 *&)"/1,.$2 3+1(,.13*4 $5 -**3 65."/2.$2 7".*/42,54./&2.*- &4$58 9:9 *&)"/1,.$2 9;< /=>6 4*?&*52*4@ A+* ./** +"4 B**5 43($. $5 .7,

3"/.4 /*3/*4*5.$58 .+* B"4"( 3"/. C(D "5- .+* 2/,75 C)D ,0 .+* *&)"/1,.$2 /=>6'B"4*-3+1(,8*51@ A+* ,&.8/,&3 B/"52+ C"/2+"*"D $4 5,. 4+,75@ A+$5 ./$"58(*4 2,//*43,5- ., .7,

/*3/*4*5.".$E* 43*2$*4 ,0 " 8$E*5 ."#,5F .+/** $5 .+* 2"4* ,0 63$2,%3(*#"@ G,,.4./"3

E"(&*4 "/* 8$E*5 ,5(1 B*(,7 5,-*4 2,52*/5$58 .+* 5*7 *&)"/1,.$2 4*?&*52*4@ A+* 2,(,&/

2,-* $5-$2".*4 4*" -*3.+4 ". 7+$2+ 4*?&*52*4 7*/* ,B."$5*-@ <2"(* B"/4 2,//*43,5- .,

9H 4&B4.$.&.$,54 3*/ 9:: 3,4$.$,54 0,/ " &5$. B/"52+ (*58.+@

© 2001 Macmillan Magazines Ltd

!"##"$% #& '(#)$"

*+, !"#$%& ' ()* +,- ' . /&0%$"%1 2,,. ' 333456789:4;<=

>!= ?@65A7<5B; C96;7B<5 65DE C<9 ;<=?69BF<5E 6@F< C9<= 7G:=B;9<HB6@ C96;7B<5">!= 67 IE,,,= D::?4 "C7:9 ?697B6@ F:J8:5;B5K<C 7G: I! 9:KB<5 <C 7G: K:5: LM,, H6F: ?6B9FE H?E <5 6N:96K:OE 0*"P#F:69;G:F> 65D ?GQ@<K:5:7B; 9:;<5F798;7B<5 DBF765;: =:7G<DF ?9<RNBD:D 8F 3B7G 6 S9F7 F89N:Q <C 7G: 7Q?: <C :8A69Q<7B; F:J8:5;:F?9:F:57 B5 <89 F6=?@:F4 #3:57QRC<89 9:?9:F:5767BN: ;@<5:F C9<= 6@@D:?7GF 3:9: F8HF:J8:57@Q ;G<F:5 C<9 ;<=?@:7: F:J8:5;B5K4 #G:;<=?@:7: F:J8:5;:F 3:9: 6@BK5:D 3B7G .E++I 6DDB7B<56@ .TP 9%!"K:5: F:J8:5;:F 9:79B:N:D C9<= D676H65AF4 " F8HF:7 <C .,. ;<=?@:7:F:J8:5;:F 36F 7G:5 F:@:;7:D C<9 ?GQ@<K:5:7B; 656@QFBFE 76AB5K F?:;B6@;69: 7< B5;@8D: 6 76U<5<=B;6@@Q H9<6D F6=?@: <C :8A69Q<7:F L6@@ 65D;@<F:F7 9:@67BN:F 7< <89 F:J8:5;:FO 7< =B5B=BV: 697:C6;7F 9:@67:D 7<76U<5<=B; F6=?@B5K4 W: ;<5F798;7:D DBF765;: L5:BKGH<89RX<B5B5KE!YOE =6UB=8=R?69FB=<5Q LZ[O 65D =6UB=8=R@BA:@BG<<D LZ*O79::FE 3GB;G ?9<D8;:D FB=B@69 ;<5K98:57 9:F8@7F4 /BK89: . FG<3F 65Z* 79:: DBF?@6QB5K 7G: :8A69Q<7B; =B;9<HB6@ DBN:9FB7Q C<85D4"F 36F :U?:;7:D <C ;<@DE GBKG@Q <UQK:567:D 367:9FE 7G:=6X<9B7Q <C

F:J8:5;:F 6CS@B67: 3B7G 7G: :8A69Q<7B; \\;9<35]]E 7G: D:5F:@QH965;G:D 6?B;6@ ?697 <C 7G: :8A69Q<7B; 79::^4 _<3:N:9E 3: C<85D7G9:: ?GQ@<7Q?:F H:@<5KB5K 7< 7G: :69@Q H965;GB5K ?697 <C 7G: .TP9%!" 79:: L/BK4 .6O4 `_.+TR>R&a`.T 9:?9:F:57F 6 5:3 @B5:6K::=:9KB5K B5 7G: 9:KB<5 <C 7G: "9;G:V<64 #G: @69K: @:5K7G <C B7FH965;G F8KK:F7F 7G67 B7 ;<8@D ;<99:F?<5D 7< 6 ?696FB7: 3G<F: 9%!"G6F :N<@N:D 96?BD@Q4 b7 3<8@D 7G8F H: \67796;7:D] 7< 7G: H6F: <C 7G:79:: HQ 6 @<5KRH965;G 67796;7B<5 697:C6;7E 6F B5D::D <;;89F 3B7GZB;9<F?<9BDB6M4 #GBF =6Q H: F8??<97:D HQ 7G: <;;899:5;: <C F:N:96@F?:;BS; D:@:7B<5F B5 7GBF F:J8:5;: LD676 5<7 FG<35OE 6F C6F7R:N<@NB5K:8A69Q<7B; F:J8:5;:F 69: <C7:5 ;G696;7:9BV:D HQ @:5K7G N69B67B<5T4`_.+>R&a`.. 6@F< ;<99:F?<5DF 7< 6 5:3 :8A69Q<7B; @B5:6K: <C85;:976B5 ?GQ@<K:5:7B; 6F;9B?7B<5E 6@7G<8KG B7 :=:9K:F B5 6 9:KB<5<C 7G: 79:: <;;8?B:D HQ 6=<:H<BD <9K65BF=F L!"#$%&%'($)%E*+&%'($)% <9 ZQU<V<6O L/BK4 .6O4 W: ;<8@D 9:79B:N: N:9Q C:3F:J8:5;:F 3B7G 7G: ?9B=:9 F:7 &aR./ c &aR.>2,%E 3GB;G 36F8F:D <5@Q 3B7G 7G: IE,,,R= F6=?@:E H87 6@@ <C 7G:= LG:9: 9:?9:RF:57:D HQ `_.+TR&a0.O 3:9: 9:@67:D 7< ,-./(+$'% F??4E 3GB;G 69::8K@:5<V<65 G:7:9<79<?GF C9:J8:57@Q C<85D B5 =69B5: H:57GB;FB7:F-4#G: DBN:9FB7Q <C ;9<35 :8A69Q<7:F BF =8;G @69K:9 L/BK4 .HO4 #G:

=<F7 C9:J8:57@Q 9:79B:N:D K9<8?F 3:9: 7G: 6@N:<@67:FE C<@@<3:D HQG:7:9<A<57F4W: 6@F< C<85D F:J8:5;:F H:@<5KB5K 7< C85KBE 65D 7< 7G:6=<:H<BD ?G6K<79<?GB; 6;657G69:65 96DB<@69B64 /85KB 9:6CS9=7G:=F:@N:F 6F <5: <C 7G: =<F7 :;<@<KB;6@@Q F8;;:FFC8@ :8A69Q<7B;@B5:6K:Fd 7G:Q G6N: :N:5 H::5 BF<@67:D C9<= 7G: H<77<= <C 7G:Z69B656 79:5;G L.,ET-M=O.,4 WB7GB5 7G: G:7:9<A<57FE 3: D:7:;7:DF:J8:5;:F 9:@67:D 7< 7G: @6HQ9B57G8@BDF L`_.+MR&a`.,O4 #G:F: 69:9:@67BN:@Q ;<==<5 B5 7G: F:6 65D ?@6Q 6 9<@: B5 D:;<=?<FB7B<5?9<;:FF:F ;<@<5BVB5K C6:;6@ ?:@@:7F 6@F< 85D:9 D::?RF:6 ;<5DB7B<5F..4#3< <7G:9 F:J8:5;:FE `_.+TR>R&a`>I 65D `_.++R&a`.,E D< 5<7;@:69@Q 6CS@B67: 3B7G 65Q A5<35 F?:;B:FE 65D =6Q 9:?9:F:57 5:3@B5:6K:F <C G:7:9<A<57F4 W: 6@F< 9:79B:N:D 6 ?:5567: DB67<=F:J8:5;: L`_.+TR>R&a`>+O 67 IE,,,= 7G67 ;<8@D ;<99:F?<5D 7<6 FB5AB5K ;:@@4 _<3:N:9E 6F B7 BF N:9Q FB=B@69 7< !0$12(3+-&405"-% F??4F:J8:5;:FE ;<==<5 DB5<e6K:@@67: :5D<FQ=HB<57FE 7GBF F:J8:5;:;<8@D B5F7:6D D:9BN: C9<= 6 DB5<e6K:@@67: :5D<FQ=HB<574"@N:<@67: F:J8:5;:F 3:9: HQ C69 7G: =<F7 DBN:9F: B5 <89 F6=?@:F4

WB7GB5 7G: ;<==<5@Q ?9:D67<9Q ;B@B67:FE 3: <H76B5:D 5:3 <@BK<RGQ=:5<?G<9:65 L`_.+TR>R&a`^O 65D ;<@?<D:65 L`_.+MR&a`2IO F:J8:5;:F4 W: 6@F< 9:;<N:9:D 7Q?B;6@ DB5<e6K:@@67:F:J8:5;:F C9<= 6@@ D:?7GF 65D H<7G ?@65A7<5B; C96;7B<5F L/BKF .H65D 2O4 #G:F: F:J8:5;:F 69: 9:@67:D 7< fQ=5<DB5B6@:F L<C7:5 @6;AB5K6 7G:;6 ;:@@ 36@@O 65D [9<9<;:5796@:F L7G:;6 3B7G 73< ?@67:FOE 3GB;G69: 8F86@@Q F=6@@.24 b57:9:F7B5K@QE 7G: N6F7 =6X<9B7Q <C F:J8:5;:F<H76B5:D K9<8?:D B5 73< =6X<9 ;@6D:F H:73::5 DB5<e6K:@@67:F 65D6?B;<=?@:U65F L3: G6N: 7:9=:D 7G:F: =69B5: 6@N:<@67: K9<8?F b 65DbbO4 #G:F: F:J8:5;:F 3:9: =6B5@Q 9:79B:N:D C9<= 7G: F=6@@:F7?@65A7<5B; C96;7B<5 67 6@@ D:?7GF L/BKF .H 65D 2O4 b5 7:9=F <C K:5:7B;DBN:9K:5;:E 7G: DBN:9FB7Q C<85D 3B7GB5 7G:F: K9<8?F BF :J8BN6@:57 7<

DH148-5-EKD18 Physarum polycephalum

DH148-EKB1 Diplonema papillatum

Euglena gracilis DH145-EKD11

Phreatamoeba balamuthi Entamoeba histolytica

Dictyostelium discoideum Ammonia beccarii

95

50

MicrosporidiaDiplomonadida

Trichomonadida

Crown eukaryotes

Haplosporidia

Myxozoa

250 m500 m

2,000 m

3,000 m

3,000 m (> 5 µm)

100

Blepharisma americanum Stylonychia pustulata

Tetrahymena pyriformis DH148-5-EKD6

Anophyroides haemophila DH147-EKD23

Pseudoplatyophrya nana Colpoda inflata

DH147-EKD19 DH145-EKD20

DH147-EKD20DH148-EKD27

DH148-EKD14 DH147-EKD3

DH147-EKD16 DH147-EKD6

DH144-EKD3 DH148-EKD22

DH147-EKD18 DH145-EKD10

Perkinsus marinus Crypthecodinium cohnii

Pyrocystis noctiluca Noctiluca scintillans

DH147-EKD21 Gymnodinium mikimotoi

Symbiodinium microadriaticum DH147-EKD17

Acanthometra sp.- AF063240 Chaunacanthid sp.-218

Heteromita globosa Euglypha rotunda

Blastocystis hominis Cafeteria roenbergensis

Ulkenia profunda Labyrinthuloides haliotidis

DH147-EKD10 Labyrinthuloides minuta

Schizochytrium minutum Thraustochytrium multirudimentale

DH148-5-EKD53 Developayella elegans

DH144-EKD10 Lagenidium giganteum

Hyphochytrium catenoides

Fragilaria striatula DH148-5-EKD54

Pseudo-nitzschia pungens Pelagomonas calceolata

Giraudyopsis stellifera

Vacuolaria virescens Nannochloropsis salina

Fucus gardneri

Cyanophora paradoxa

Hydra littoralis Saccharomyces cerevisiae

DH148-5-EKD21 Eupenicillium javanicum

100

50

12 84

42 89 39 38

100

70 69

48

99

100

29

57 19

100

100

Apicomplexa

Chlorarachniophyta

HaptophyceaeGreen plantsCryptophyta

AcanthamoebidaeRed algae

Chrysophyceae

DictyochophyceaeBolidophyceae

Centric diatoms

Metazoa

Fungi

Ciliates

Dinoflagellates

Pennate diatoms

Alveolates

Marine alveolateGroup II

Marine alveolateGroup I

Acantharea

ThraustochytridsLabyrinthulids

&

Heterokonts

a

b

48

96

80

100

100

100

43

!"#$%& ' !"#$%&%'($)*($+,,- ./** ,0 *&)"/1,.$2 3+1(,.13*4 $5 -**3 65."/2.$2 7".*/42,54./&2.*- &4$58 9:9 *&)"/1,.$2 9;< /=>6 4*?&*52*4@ A+* ./** +"4 B**5 43($. $5 .7,

3"/.4 /*3/*4*5.$58 .+* B"4"( 3"/. C(D "5- .+* 2/,75 C)D ,0 .+* *&)"/1,.$2 /=>6'B"4*-3+1(,8*51@ A+* ,&.8/,&3 B/"52+ C"/2+"*"D $4 5,. 4+,75@ A+$5 ./$"58(*4 2,//*43,5- ., .7,

/*3/*4*5.".$E* 43*2$*4 ,0 " 8$E*5 ."#,5F .+/** $5 .+* 2"4* ,0 63$2,%3(*#"@ G,,.4./"3

E"(&*4 "/* 8$E*5 ,5(1 B*(,7 5,-*4 2,52*/5$58 .+* 5*7 *&)"/1,.$2 4*?&*52*4@ A+* 2,(,&/

2,-* $5-$2".*4 4*" -*3.+4 ". 7+$2+ 4*?&*52*4 7*/* ,B."$5*-@ <2"(* B"/4 2,//*43,5- .,

9H 4&B4.$.&.$,54 3*/ 9:: 3,4$.$,54 0,/ " &5$. B/"52+ (*58.+@

© 2001 Macmillan Magazines Ltd

Drake Passage

David MoreiraPurificacion

Lopez-Garcia

SSU rRNA

Page 15: Denis BAURAIN

Phylum-level Findings (1)body stainable with the nucleic acid–specificdye DAPI (Fig. 1) may be a DNA-containingnucleomorph, similar to that found in crypto-

phytes and chlorarachniophytes (17), support-ing the idea that picobiliphytes are anothersecondary endosymbiotic algal group (18).

Kleptoplastidy is another possibility, such asin the katablepharids (19, 20), which along withthe cryptophytes are the picobiliphytes’ pur-ported sister group. However, kleptoplastidy isunlikely in such small organisms. In the absenceof living cells to follow through cell division, wescreened filtered 3-mm–fractioned water forcells that hybridized with our probes, using aChemScan solid-phase cytometer (8) (fig. S2).We never encountered positive cells without aplastid on the filters scanned by the laser,which implies that the cells are predominatelypigmented, so kleptoplastidy does not seemvery likely.

Are the picobiliphytes representatives ofanother red algal secondary endosymbiosis,such as chromo-alveolates, in the broad sense,or do they have kleptoplastids? Without livingcells, the status of their endosymbiosis and aformal description will remain unresolved.Nevertheless, picobiliphytes are pigmented andthus contribute to primary production. Molecu-lar analysis confirms that they are a eukaryoticgroup that should be recognized at the phylumor division level, without any real indication oftheir sister group. We found that they are wellrepresented in polar and cold temperate coastalmarine ecosystems, as judged from their ap-pearance in clone libraries and preliminary FISHdata. The putative presence of a DNA-containingbody in the purported plastid places them in anintriguing position in the study of plastid re-duction to organelles.

Within the past 15 years, four algal classeshave been described from the picoplankton [see(5) for details], and picobiliphytes representanother division or phylum. The phylogeneticanalysis indicates that they are a highly diversegroup, composed of at least three distinct clades.The temporal and spatial scales at which theyoccur, as inferred from molecular data, indicatethat they could make up a substantial picoplank-ton fraction under certain conditions. The ex-istence of small, sometimes rare, organisms isonly now being recognized, and their role inecosystem function is unknown, but they prob-ably act as reservoirs of genetic capacity that areactivated under specific conditions. The discov-ery of picobiliphytes and their apparent wide-spread distribution and contribution to marineprotist assemblages highlight the imperative ofunderstanding biodiversity before its loss on aglobal scale.

References and Notes1. B. Díez, C. Pedrós-Alió, R. Massana, Appl. Environ.

Microbiol. 67, 2932 (2001).2. P. López-García, F. Rodríguez-Valera, C. Pedrós-Alió,

D. Moreira, Nature 409, 603 (2001).3. S. Y. Moon-van der Stay, R. De Wachter, D. Vaulot,

Nature 409, 607 (2001).4. K. Romari, D. Vaulot, Limnol. Oceanogr. 49, 784 (2004).5. L. K. Medlin et al., Microb. Ecol. 167, 1432 (2006).6. R. Massana, V. Balagué, L. Guillou, C. Pedros-Alió,

FEMS Microbiol. Ecol. 50, 231 (2004).7. C. Lovejoy, R. Massana, C. Pedrós-Alió, Appl. Environ.

Microbiol. 72, 3085 (2006).8. See supporting material on Science Online.

Fig. 1. Phylogenetic trees were reconstructed from full-length 18S rRNA sequence data listed in tableS1 and inferred with Bayesian analysis from two parallel runs, each with one million generations with sixchains and increased temperature between the chains to facilitate exchange between the chains (8). Thistree is the 50% majority-rule tree of the last 100 trees saved from one of the parallel runs. Support foreach node was also determined with 100 replicated bootstrap analyses of weighted maximumparsimony and neighbor-joining analyses. Nodes supported by bootstrap or posterior probability valuesabove 50% are labeled for the three methods used (MrBayes/maximum parsimony/neighbor-joining). Ifa clade was not supported by a method, it is indicated by a dash. The asterisk indicates that internalmajor clades were supported by 100 posterior probabilities from the MrBayes analysis. PICOBI01 andPICOBI02 are specific for the sequences belonging to the three clades as bracketed. (Insert) Picture of acell targeted by the probe PICOBI02 (specific for picobiliphyte clade 2) from the Roscoff ASTANsampling site on 26 September 2001. Arrows point to the DAPI-stained nucleus (nuc) in blue, to thegreen fluorescence from probe-specific labeling of the small subunit rRNA in the cytoplasm (cyto), andto the red autofluorescence from the phycobiliprotein-containing organelle (PBPorg). Double asterisksindicate sequences not recognized by the probes.

12 JANUARY 2007 VOL 315 SCIENCE www.sciencemag.org254

REPORTS

on

Janu

ary

13, 2

007

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fro

m

Not et al. (2007) Science 315:253-255; Yoon et al. (2011) Science 332:714-717

SSU rRNA,FISH, SCG

Single-Cell Genomics RevealsOrganismal Interactions inUncultivated Marine ProtistsHwan Su Yoon,1,2* Dana C. Price,3* Ramunas Stepanauskas,1 Veeran D. Rajah,3 Michael E. Sieracki,1

William H. Wilson,1 Eun Chan Yang,1 Siobain Duffy,3 Debashish Bhattacharya3†

Whole-genome shotgun sequence data from three individual cells isolated from seawater,followed by analysis of ribosomal DNA, indicated that the cells represented three divergent cladesof picobiliphytes. In contrast with the recent description of this phylum, we found no evidence of plastidDNA nor of nuclear-encoded plastid-targeted proteins, which suggests that these picobiliphytesare heterotrophs. Genome data from one cell were dominated by sequences from a widespreadsingle-stranded DNA virus. This virus was absent from the other two cells, both of which containednon-eukaryote DNA derived from marine Bacteroidetes and large DNA viruses. By using shotgunsequencing of uncultured marine picobiliphytes, we revealed the distinct interactions of individual cells.

Culture-independent analyses of environ-mental ribosomal DNA (rDNA) clone li-braries and metagenomes can uncover

unexpected microbial species and gene diversity(e.g., 1–3). These methods cannot, however, re-

veal in situ interactions among organisms. Toachieve this level of resolution, genome data fromsingle cells captured from the wild environmentare needed. We used single-cell genomics (4–7)to study themarine plankton group Picobiliphyta,

recently described as a previously unknown lineageof pigmented eukaryotes with a phylogenetic af-finity to cryptophytes and katablepharids (8, 9). Thecellswere originally identifiedmicroscopicallywiththe use of 18S rDNA-based fluorescent in situhybridization probes. Although their ultrastructureis unknown, previous studies using autofluores-cence and 4´,6-diamidino-2-phenylindole stainingdata (9, 10) appeared to show that picobiliphytescontain a plastid derived from a cryptophyte alga(owing to the presence of phycobilin proteins;hence the phylum name) and the associated rem-nant nucleus (nucleomorph). These taxa havenot yet been successfully cultivated, leaving openthe possibility that the plastid and nucleomorphmay not be permanent acquisitions but rathercome from a klepto-plastid or a cryptophyte alga

1Bigelow Laboratory for Ocean Sciences, West Boothbay Har-bor, ME 04575, USA. 2Department of Biological Sciences,Sungkyunkwan University, Suwon 440-746, South Korea. 3De-partment of Ecology, Evolution and Natural Resources, RutgersUniversity, New Brunswick, NJ 08901, USA.

*These authors contributed equally to this work.†To whom correspondence should be addressed. E-mail:[email protected]

All gene hitsUnique gene hits

MS584-22 Contigs with hits

MS584-11 Contigs with hits

MS584-5 Contigs with hits

//

B C

D

A

MS584-51804 nt

putativ

erep

licat

ion-

asso

cia

tedprotein

100200

300400

500600

700800900100011

0012

0013

0014

0015

00

1600 1700

unknown

function

93

100

62

70

66

63

69

62

95

100

80

96

100100

0.2 substitutions/sites

circovirusescycloviruses

nanoviruses

oceanmetagenome

oceanmetagenomecryptophytes

DQ222873 North Sea

DQ222876 English Channel

DQ222877 English Channel

DQ222878 English Channel

EU368039 North Atlantic Slope

DQ060525 Arctic Ocean

Boothbay MS609-66DQ060524 Arctic Ocean

DQ222872 North Sea

EU368038 North Atlantic Slope

EU368037 North Atlantic Slope

DQ222879 English Channel

DQ222880 English Channel

DQ222875 North Atlantic

DQ060526 Arctic Ocean

AY426835 Mediterranean coast

DQ222874 North Sea

AB240953 Cryptomonas rostratiformis

AB240952 Cryptomonas ovata

L28811 Chilomonas paramecium

AB240955 Cryptomonas paramecium

AB241128 Rhodomonas balticaU03072 Goniomonastruncata

AB194980 Leucocryptos marina

Environmental sample POVAMDc39

AY919672 Katablepharis remigera

AB231617 Katablepharis japonica

X70803 Glaucocystis nostochinearum

X81901 Gloeochaete wittrockiana

AY823716 Cyanophora paradoxa

0.1 substitutions/site

100

60

97

100

katablepharids

glaucophytes

100

100

100100

100

100100

100100

100100

100

100100

98

8487

95

98

100

98

9587

9899

7781

9696

76

69

0 50 100 150

MetazoaProteobacteria

ViridiplantaeOthers

StramenopilesVira

FungiBacteroidetes

HaptophytaAlveolataExcavata

Firmicutes

0 20 40 60 80 100 120

OthersMetazoa

ViridiplantaeVira

StramenopilesProteobacteria

Haptophyta FungiCiliata

894

0 100 200 300 400

Bacteroidetes Metazoa

Viridiplantae Stramenopiles Proteobacteria

Haptophyta Fungi

Excavata Others

Choanoflagellida Alveolata

Firmicutes Rhodophyta Amoebozoa

Vira Fusobacteria Elusimicrobia

1686//

EBA53362 GOS_11576ECU79003 GOS_10979

EBA56617 GOS_7546ECL36795 GOS_3680446

ECU78738 GOS_11246ECU78869 GOS_11113

EBA56731 GOS_7420EBA53737 GOS_9213

ECU79006 GOS_10976EBA56841 GOS_7308

ECU78740 GOS_11241 ECU78741 GOS_11242

EBA57629 GOS_6345

RW_D FJ959080

CB_A FJ959082BBC_A FJ959086

RW_C FJ959079

RW_E FJ959081RW_A FJ959077

RW_B FJ959078

picobiliphyte MS584-5

MS584-5

MS584-22

picobiliphytesMS584-11

Fig. 1. (A) Randomized accelerated maximum likelihood (RAxML) phyloge-netic tree of picobiliphyte SSU rDNA coding regions. RAxML bootstrap valuesare above the branches, and those derived from maximum parsimony (whennodes are shared) are below the branches. Only bootstrap values !60% areshown. Sequenced genomes are in bold. GenBank numbers are shown for eachtaxon. (B) Analysis of the taxonomic distribution of BLASTx hits using as querythe 454-derived contigs from each SAG assembly (when !10; if "10, thedifferent hits were grouped under “Others”). The total number of hits (bluebars) and the unique gene hits (red bars) are shown for MS584-5, MS584-11,

and MS584-22. Some taxa are overrepresented, such as virus hits in MS584-5and Bacteroidetes in MS584-22 that are probably explained by MDA bias. (C)Genome structure of the previously unknown ssDNA virus. (D) SimplifiedRAxML tree of Rep proteins from representative ssDNA viruses, showing thephylogenetic position of the MS584-5 sequence. Rep from marine ssDNAviruses is shown in blue, whereas sequences derived from ocean metagenomedata are shown in red. The bootstrap values (when !60%) above the branchesare from RAxML, whereas those below are from PhyML. The full tree is shownin fig. S2A.

6 MAY 2011 VOL 332 SCIENCE www.sciencemag.org714

REPORTS

on

Febr

uary

16,

201

2w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 16: Denis BAURAIN

Phylum-level Findings (2)

cryp

tom

ycot

agr

oup-

spec

ific

prob

esan

da

pane

lof

nega

tive

cont

rols

(det

aile

din

the

Met

hods

)wer

eco

nsis

tent

lyne

gativ

e.T

oco

mpl

emen

tth

isap

proa

ch,

we

also

used

two

LKM

11FI

SHpr

obes

prev

ious

lyus

edin

asso

ciat

ion

with

addi

tiona

leu

kary

otic

prob

esto

dete

rmin

eth

eeuk

aryo

ticco

mm

unity

stru

ctur

ein

fres

hwat

eren

viro

nmen

ts15

.LK

M11

deno

tes

asu

bsec

tion

ofth

een

viro

nmen

tal

sequ

ence

sbra

nchi

ngw

ithin

the

cryp

tom

ycot

acl

ade

dete

cted

inea

rlie

rst

udie

s6–8 .

We

used

the

prob

eLK

M11

-01

(Fig

.1a

,c)

toid

entif

ya

seco

ndsu

bset

ofth

ecr

ypto

myc

ota

clad

ean

dco

nsis

tent

lyre

cove

red

ace

llty

pesi

mila

rto

that

reco

vere

dfo

rCM

1(F

ig.1

d).T

heLK

M11

-02

prob

ew

asne

gativ

e.W

eno

teth

atth

eab

unda

nce

ofce

llsid

entif

ied

byLK

M11

-01

was

atle

astt

enfo

ldhi

gher

than

obse

rved

fort

heC

M1

cells

,su

gges

ting

that

the

LKM

11-0

1su

bcla

dew

asm

ore

abun

dant

than

the

CM

1su

bcla

dein

the

envi

ronm

ents

sam

pled

.T

oex

amin

efu

rthe

rthe

life

cycl

ean

dm

orph

olog

yof

the

targ

etgr

oups

inth

efr

eshw

ater

sam

ples

,w

ete

sted

whe

ther

the

cells

iden

tifie

dby

Phy

sode

rma

may

dis

Nuc

lear

ia s

impl

ex

Ento

phly

ctis

hel

iofo

rmis

Bas

idio

bolu

s ra

naru

m Bla

stoc

ladi

ella

em

erso

nii

Hya

lora

phid

ium

cur

vatu

m

Bat

rach

ochy

triu

m d

endr

obat

idis

Cyl

lam

yces

abe

rens

is

Mon

oble

phar

ella

sp.

Chy

trio

myc

es h

yalin

usR

hizo

clos

mat

ium

sp.

Rhi

zoph

ydiu

m s

phae

roth

ecaA

llom

yces

arb

uscu

la

Oed

ogon

iom

yces

sp.

Par

aglo

mus

occ

ultu

m

Rhi

zoph

lyct

is ro

sea

Cat

enom

yces

sp.

Pol

ychy

triu

m a

ggre

gatu

m

Glo

mus

mos

seae

Trip

artic

alca

r arc

ticum

Sm

ittiu

m c

ulis

etae

Pow

ello

myc

es s

p.

Kar

lingi

omyc

es s

p.

Dot

hide

a sa

mbu

ci

Sch

izos

acch

arom

yces

pom

be

Dis

ciot

is v

enos

a

Rhi

zoph

ydiu

m b

rook

sian

um

Rhi

zopu

s or

yzae

Deb

aryo

myc

es h

anse

nii

Can

dida

gla

brat

a

Ust

ilago

may

dis

Mag

napo

rthe

gris

eaN

euro

spor

a cr

assa

Cop

rinop

sis

cine

rea

Mon

ascu

s pu

rpur

eus

Cal

ocer

a co

rnea

Roz

ella

allo

myc

is

Phy

com

yces

bla

kesl

eean

us

Filo

basi

diel

la n

eofo

rman

s

Mon

ilini

a fr

uctic

ola

Roz

ella

sp.

Hyp

ocre

a je

corin

a

Zeuk

2Was

hing

ton

Sin

ger

CM

1

WIM

27

kor_

1109

04_2

4

CV1

_B2_

34

RS

C-C

HU

-59

I34

RS

C-C

HU

-43

PFB

3AU

2004

NA

MA

KO

-36

Am

b_18

S_3

97

PFB

7SP

2005

LKM

46P

1-3m

12

PFD

5AU

2004

ww

euk6

RS

C-C

HU

-42

BA

QA

64

BA

QA

04

PFA

12A

U20

04

Elev

_18S

_792

Lily

Ste

m C

M2

LKM

11

PG

5.28

P2-

3m4

BA

QA

254

P2-

3m3

G40

P34

.42

Am

b_18

S_4

05

DS

GM

-64LK

M15

P2-

3m5

dpeu

k6

Am

b_18

S_1

059

TAG

IRI-

23

P1-

3m9

P4-

3m5

PFB

12A

U20

04

P1-

3m3

RT5

iin3

Elev

_18S

_791

Join

v23

PR

S2_

4E_0

6

NA

MA

KO

-37

G5.

3P

6-3m

3C

H1_

S2_

50P

robe

CM

1.1

Pro

beC

M1.

2

Cap

sasp

ora

owcz

arza

kiA

moe

bidi

um p

aras

iticu

mS

uber

ites

ficus A

urel

ia s

p.

Sac

char

omyc

es c

erev

isia

e

Pro

be

LKM

11-0

1

Endo

chyt

rium

sp.

N

owak

owsk

iella

sp.

Cla

doch

ytriu

m r

eplic

atum

Asc

omyc

ota

Roz

ella

Bas

idio

myc

ota

Zygo

myc

ota

Glo

mer

omyc

ota

Bla

stoc

ladi

omyc

ota

Opi

stho

kont

out

grou

p

Bla

stoc

ladi

omyc

ota

Chy

trid

iale

s

Chy

trid

iale

s

Zygo

myc

ota

Glo

mer

omyc

ota

Mon

oble

phar

idal

es

Spi

zello

myc

etal

es

1/66

/36

1/96

/96

SS

U5.

8SLS

U

CM

1 pr

imer

Was

hing

ton

Sin

ger C

M1

28S

r1A

U4

1520

r

AU

2

28S

r1

V9

varia

ble

reg

ion

CM

2 pr

imer

CM

1.1

prob

e Lily

Ste

m C

M2

Was

hing

ton

Sin

ger C

M2

CM

1.2

prob

e

LKM

11-0

1 gr

oup

LKM

11-0

1 pr

obe

0.6/

50/5

00.

9/80

/80

0.2

subs

titut

ions

per

site

CM

2

–1/2

CM1.1 CM1.2 LKM11-01

DAP

I

TSA

-FIS

H

Ove

rlay

Pla

tygl

oea

disc

iform

is

Chy

trid

iale

s

Neo

calli

mas

tigal

es

Zygo

myc

ota

Chy

trid

iale

s

–1/2

–1/2

Tille

tiaria

ano

mal

a

–1/2

c dba

Cryptomycota

Mar

ine

sedi

men

ts, l

ow ti

de, B

erke

ley

Aqu

atic

Par

k, U

SA

Sul

phid

e-ric

h sp

ring,

Okl

ahom

a, U

SA

Pol

lute

d aq

uife

r sed

imen

t, B

anis

veld

land

!ll,

The

Net

herla

nds

Fres

hwat

er a

naer

obic

pon

d, O

klah

oma,

US

ATh

e R

io T

into

, pH

2 ri

ver,

Spa

inA

naer

obic

was

te-w

ater

trea

tmen

t pla

nt, O

klah

oma,

US

AC

ontin

uous

cul

ture

inoc

ulat

ed w

ith la

ke w

ater

, The

Net

herla

nds

Lake

Kor

onia

, wat

er c

olum

n, G

reec

eA

naer

obic

aqu

ifer p

ollu

ted

with

land

!ll

Fres

hwat

er, L

ake

Pav

in, F

ranc

eFr

eshw

ater

lake

s, F

ranc

eFr

eshw

ater

cla

y/sa

nd s

edim

ent,

Fran

ceLa

ke H

uron

sed

imen

t sam

ple,

US

AC

hlor

inat

ed d

rinki

ng w

ater

, Fra

nce

Pea

t bog

, Sw

itzer

land

Trem

blin

g as

pen

rhiz

osph

ere,

am

bien

t CO

2 co

nditi

ons,

US

ATr

embl

ing

aspe

n rh

izos

pher

e, e

leva

ted

CO

2 co

nditi

ons,

US

AR

hizo

sphe

re s

oil o

f "ow

erin

g m

aize

pla

nts,

Ger

man

y19

75 n

on-f

ertil

ized

agr

icul

tura

l soi

l sam

ple,

The

Net

herla

nds

Eutr

ophi

c fre

shw

ater

, UK

(thi

s st

udy)

Ano

xic

sedi

men

t, su

bmar

ine

cald

era

"oor

, Jap

anU

pper

mar

ine

met

hane

col

d se

ep s

edim

ent,

Japa

nA

noxi

c se

dim

ent,

Lake

Nam

ako-

ike,

Jap

an

Figu

re1

|Ide

ntif

icat

ion

ofth

ecr

ypto

myc

ota.

a,Ph

ylog

eny

dem

onst

rate

sa

dive

rse

clad

eof

envi

ronm

enta

lseq

uenc

es(c

olou

red

dots

)bra

nchi

ngat

the

base

ofth

eFu

ngi.

MrB

ayes

tree

topo

logy

was

calc

ulat

edfr

oman

alig

nmen

tof1

00se

quen

cesa

nd1,

012

DN

Ach

arac

ters

.Sup

port

valu

esar

esu

mm

ariz

edby

blac

kdo

ts(i

ndic

atin

gat

leas

t0.9

Bay

esia

npo

ster

ior

prob

abili

tyan

d80

%bo

otst

rap

supp

ortb

yM

Lan

dLo

g-D

etdi

stan

cem

etho

ds)

orby

blac

kri

ngs

(Bay

esia

npo

ster

ior

prob

abili

tyab

ove

0.6

and

boot

stra

psu

ppor

tabo

ve50

%).

The

key

node

show

ing

mon

ophy

lyof

cryp

tom

ycot

aen

com

pass

ing

Roz

ella

ism

arke

d

with

actu

alva

lues

,with

the

SSU

–5.8

S–LS

Usu

ppor

tval

ue(S

uppl

emen

tary

Fig.

2a)

show

nin

red.

Shor

tene

dbr

anch

esar

ein

grey

.Seq

uenc

esta

rget

edby

the

prob

es/p

rim

ers

are

labe

lled

onth

etr

ee.b

,Pro

vena

nce

listo

fenv

iron

men

tal

DN

Ase

quen

ces.

c,Sc

hem

atic

repr

esen

tatio

nof

the

posi

tions

ofpr

imer

s(a

rrow

s)an

dpr

obes

(gre

enhe

xago

ns)

ona

part

ialr

DN

Age

near

ray

(gre

y).

d,T

SA-F

ISH

iden

tific

atio

nof

cryp

tom

ycot

ace

lls(g

reen

)and

DA

PIst

aini

ngof

mic

robi

alco

mm

unity

(blu

e).S

cale

bar,

10m

m.

RESE

ARCH

LETT

ER

2|

NA

TU

RE

|V

OL

00

0|

00

MO

NT

H2

01

1

Mac

mill

an P

ublis

hers

Lim

ited.

All

right

s re

serv

ed©2011

Jones et al. (2011) Nature 474:200-203

SSU rRNA

Page 17: Denis BAURAIN

3Quelles sont les relationsentre les groupes d’Eucaryotes ?

Page 18: Denis BAURAIN

Archezoa vs. ‘Crown’ groups

© 2006 Nature Publishing Group

Eukaryotic evolution, changes andchallengesT. Martin Embley1 & William Martin2

The idea that some eukaryotes primitively lacked mitochondria and were true intermediates in the prokaryote-to-eukaryote transition was an exciting prospect. It spawned major advances in understanding anaerobic and parasiticeukaryotes and those with previously overlooked mitochondria. But the evolutionary gap between prokaryotes andeukaryotes is now deeper, and the nature of the host that acquired the mitochondrion more obscure, than ever before.

New findings have profoundly changed the ways in which weview early eukaryotic evolution, the composition of majorgroups, and the relationships among them. The changeshave been driven by a flood of sequence data combined with

improved—but by nomeans consummate—computational methodsof phylogenetic inference. Various lineages of oxygen-shunning orparasitic eukaryotes were once thought to lack mitochondria andto have diverged before the mitochondrial endosymbiotic event.Such key lineages, which are salient to traditional concepts abouteukaryote evolution, include the diplomonads (for example,Giardia),trichomonads (for example, Trichomonas) and microsporidia (forexample, Vairimorpha). From today’s perspective, many key groupshave been regrouped in unexpected ways, and aerobic and anaerobiceukaryotes intermingle throughout the unfolding tree.Mitochondriain previously unknown biochemical manifestations seem to beuniversal among eukaryotes, modifying our views about the natureof the earliest eukaryotic cells and testifying to the importance ofendosymbiosis in eukaryotic evolution. These advances have freedthe field to consider new hypotheses for eukaryogenesis and to weighthese, and earlier theories, against the molecular record preserved ingenomes. Newer findings even call into question the very notion of a‘tree’ as an adequate metaphor to describe the relationships amonggenomes. Placing eukaryotic evolution within a time frame andancient ecological context is still problematic owing to the vagariesof the molecular clock and the paucity of Proterozoic fossil eukaryotesthat can be clearly assigned to contemporary groups. Although thebroader contours of the eukaryote phylogenetic tree are emergingfrom genomic studies, the details of its deepest branches, and its root,remain uncertain.

The universal tree and early-branching eukaryotic lineagesThe universal tree based on small-subunit (SSU) ribosomal RNA1

provided a first overarching view of the relationships between thedifferent types of cellular life. The relationships among eukaryotesrecovered from rRNA2, backed up by trees of translation elongationfactor (EF) proteins3, provided what seemed to be a consistent, andhence compelling, picture (Fig. 1). The three protozoa at the base ofthese trees (Giardia, Trichomonas and Vairimorpha), along withEntamoeba and its relatives, were seen as members of an ultrastruc-turally simple, paraphyletic group of eukaryotes called the Archezoa4.Archezoawere thought to primitively lackmitochondria, having splitfrom the main trunk of the eukaryotic tree before the mitochondrialendosymbiosis: all other eukaryotes contain mitochondria because

they diverged after this singular symbiotic event5. Therefore, Archezoawere interpreted as contemporary descendants of a phagotrophic,nucleated, amitochondriate cell lineage that included the host for themitochondrial endosymbiont6. The apparent agreement betweenmolecules and morphology depicted the relative timing of themitochondrial endosymbiosis (Fig. 1) as a crucial, but not ancestral,event in eukaryote phylogeny.

Chinks in the consensusMitochondrial genomes studied so far encode less than 70 of theproteins that mitochondria need to function5; most mitochondrialproteins are encoded by the nuclear genome and are targeted to

REVIEWS

Figure 1 | The general outline of eukaryote evolution provided by rootedrRNA trees. The tree has been redrawn and modified from ref. 92. Untilrecently, lineages branching near the root were thought to primitively lackmitochondria and were termed Archezoa4. Exactly which archezoansbranched first is not clearly resolved by rRNA data2, hence the polytomy(more than two branches from the same node) involving diplomonads,parabasalids and microsporidia at the root. Plastid-bearing lineages areindicated in colours approximating their respective pigmentation. Lineagesfurthest away from the root, including those with multicellularity, werethought to be the latest-branching forms and were sometimes misleadingly(see ref. 60) called the ‘crown’ groups.

1School of Biology, The Devonshire Building, University of Newcastle upon Tyne, Newcastle NE1 7RU, UK. 2Institute of Botany III, University of Dusseldorf, D-40225 Dusseldorf,Germany.

Vol 440|30 March 2006|doi:10.1038/nature04546

623

Cavalier-Smith (1993) Microbiol Rev 57:953-994; Embley & Martin (2006) Nature 440:623-630

SSU rRNA & elongation factors

Tom Cavalier-Smith

~1995

Page 19: Denis BAURAIN

What are the Archezoa?

Giardia Trichomonas

Encephalitozoon

Giardia Trichomonas

Encephalitozoon

parasitesamitochondriaux

(ultrastructure simple,pas de peroxysomes,

système endomembranaire peu développé)

Page 20: Denis BAURAIN

Not so Archezoa (1)

S150 The American Naturalist

Figure 3: Phylogenies of chaperonin 60 and heat shock protein 70 genes from amitochondriate protists. Trees were obtained from protein maximumlikelihood (PROTML; Adachi and Hasegawa 1996) analysis of aligned cpn60 (513 positions) (A) and hsp70 (515 positions) (B) data sets. Phylogeneticmethodology is described in Roger et al. (1998). Amitochondriate lineages are shown in bold.

could be highly modified mitochondria (Cavalier-Smith1987).

When data for hydrogenosomal proteins became avail-able from T. vaginalis, it was shown that they had mito-chondrial-like N-terminal targeting peptides that werecleaved upon import into the hydrogenosome (Johnsonet al. 1993). However, stronger evidence came with the

report of cpn60, cpn10, and hsp70 genes in T. vaginalis(Bui et al. 1996; Germot et al. 1996; Horner et al. 1996;Roger et al. 1996). Once again, phylogenetic analyses in-dicated that these genes were clearly related to mitochon-drial isoforms (fig. 3). Furthermore, the cpn60 and hsp70proteins appear to localize to hydrogenosomes in this or-ganism (Bui et al. 1996; Bozner 1997), where they may

Roger (1999) Am Nat 154:146-163

S150 The American Naturalist

Figure 3: Phylogenies of chaperonin 60 and heat shock protein 70 genes from amitochondriate protists. Trees were obtained from protein maximumlikelihood (PROTML; Adachi and Hasegawa 1996) analysis of aligned cpn60 (513 positions) (A) and hsp70 (515 positions) (B) data sets. Phylogeneticmethodology is described in Roger et al. (1998). Amitochondriate lineages are shown in bold.

could be highly modified mitochondria (Cavalier-Smith1987).

When data for hydrogenosomal proteins became avail-able from T. vaginalis, it was shown that they had mito-chondrial-like N-terminal targeting peptides that werecleaved upon import into the hydrogenosome (Johnsonet al. 1993). However, stronger evidence came with the

report of cpn60, cpn10, and hsp70 genes in T. vaginalis(Bui et al. 1996; Germot et al. 1996; Horner et al. 1996;Roger et al. 1996). Once again, phylogenetic analyses in-dicated that these genes were clearly related to mitochon-drial isoforms (fig. 3). Furthermore, the cpn60 and hsp70proteins appear to localize to hydrogenosomes in this or-ganism (Bui et al. 1996; Bozner 1997), where they may

CPN60 HSP70

présence de gènes nucléaires d’origine mitochondriale dans les Archezoa

Andrew Roger

Page 21: Denis BAURAIN

Not so Archezoa (2)

© 2006 Nature Publishing Group

mitochondria using a protein import machinery that is specific tothis organelle7. The mitochondrial endosymbiont is thought to havebelonged to the a-proteobacteria, because some genes and proteinsstill encoded by the mitochondrial genome branch in moleculartrees among homologues from this group5,8. Some mitochondrialproteins, such as the 60- and 70-kDa heat shock proteins (Hsp60,Hsp70), also branch among a-proteobacterial homologues, but thegenes are encoded by the host nuclear genome. This is readilyexplained by a corollary to endosymbiotic theory called endosym-biotic gene transfer9: during the course of mitochondrial genomereduction, genes were transferred from the endosymbiont’s genometo the host’s chromosomes, but the encoded proteins were re-imported into the organelle where they originally functioned. Withthe caveat that gene origin and protein localization do not alwayscorrespond9, any nuclear-encoded protein that functions in mito-chondria and clusters with a-proteobacterial homologues is mostsimply explained as originating from the mitochondrion in thismanner.By that reasoning10, the discovery of mitochondrial Hsp60 in

E. histolytica was taken as evidence that its ancestors harbouredmitochondria. A flood of similar reports on mitochondrial Hsp60and Hsp70 from all key groups of Archezoa ensued11, suggesting that

their common ancestor also contained mitochondria. At face value,those findings falsified the central prediction of the archezoanconcept. However, suggestions were offered that lateral gene transfer(LGT) in a context not involving mitochondria could also accountfor the data. But that explanation, apart from being convoluted, nowseems unnecessary: the organisms once named Archezoa for lack ofmitochondria not only have mitochondrial-derived proteins, theyhave the corresponding double-membrane-bounded organelles aswell.

Mitochondria in multiple guisesThe former archezoans aremostly anaerobes, avoiding all but a trace ofoxygen, and like many anaerobes, including various ciliates and fungithat were never grouped within the Archezoa, they are now known toharbour derived mitochondrial organelles—hydrogenosomes andmitosomes. These organelles all share one or more traits in commonwith mitochondria (Fig. 2), but no traits common to them all, apartfrom the double membrane and conserved mechanisms of proteinimport, have been identified so far. Mitochondria typically—butnot always (the Cryptosporidium mitochondrion lacks DNA12)—possess a genome that encodes components involved in oxidativephosphorylation5.With one notable exception13, all hydrogenosomes

Figure 2 | Enzymes and pathways found in various manifestations ofmitochondria. Proteins sharing more sequence similarity to eubacterialthan to archaebacterial homologues are shaded blue; those with conversesimilarity pattern are shaded red; those whose presence is based only onbiochemical evidence are shaded grey; those lacking clearly homologouscounterparts in prokaryotes are shaded green. a, Schematic summary ofsalient biochemical functions in mitochondria5,88, including some anaerobicforms16,17. b, Schematic summary of salient biochemical functions inhydrogenosomes14,19. c, Schematic summary of available findings formitosomes and ‘remnant’ mitochondria32–34,93. The asterisk next to theTrachipleistophora and Cryptosporidium mitosomes denotes that theseorganisms are not anaerobes in the sense that they do not inhabit O2-poor

niches, but that their ATP supply is apparently O2-independent. UQ,ubiquinone; CI, mitochondrial complex I (and II, III and IV, respectively);NAD, nicotinamide adenine dinucleotide; MCF, mitochondrial carrierfamily protein transporting ADP and ATP; STK, succinate thiokinase;PFO, pyruvate:ferredoxin oxidoreductase; PDH, pyruvate dehydrogenase;CoA, coenzyme A; Fd, ferredoxin; HDR, iron-only hydrogenase;PFL, pyruvate:formate lyase; ASC, acetate-succinate CoA transferase;ADHE, bi-functional alcohol acetaldehyde dehydrogenase; FRD, fumaratereductase; RQ, rhodoquinone; Hsp, heat shock protein; IscU, iron–sulphurcluster assembly scaffold protein; IscS; cysteine desulphurase; ACS (ADP),acetyl-CoA synthase (ADP-forming).

REVIEWS NATURE|Vol 440|30 March 2006

624

Embley & Martin (2006) Nature 440:623-630

présence d’organelles homologues aux mitochondries dans les Archezoa

hydrogenosomes mitosomes mitochondries

Page 22: Denis BAURAIN

animal, plant, and some protist RPB1 protein sequences (36,37) but are absent from the protists Trichomonas, Trypano-soma, and Giardia (38) as well as from the red algae Bonne-maisonia and Porphyra (39). Thus, the presence of these CTDrepeats is consistent with the inferred fungal relationship (seebelow). The CTD has been implicated in the processing ofspliceosomal introns from pre-mRNA (40), so it is interestingthat spliceosomal snRNAs have now been reported from bothV. necatrix and N. locustae (ref. 18 and references therein), anda spliceosomal intron has recently been discovered in themicrosporidian Encephalitozoon cuniculi (41). With the excep-tion of red algae, the known presence of spliceosomal intronsin eukaryotes (ref. 42; J.M.L., unpublished data) is perfectlycorrelated with the possession of CTD heptapeptide repeats(37, 39).

RPBI Protein and DNA Sequences Support a RelationshipBetween Microsporidia and Fungi. The protein ML tree (Fig.1) placed the two Microsporidia together with the fungal RPB1sequences (M ! F) with strong bootstrap support in allanalyses. Use of the Kishino–Hasegawa (29) test to comparethe statistical significance of different trees also supports M !F. In fact, the one tree we found that could not be rejected atthe 0.05 level and that placed Microsporidia early also placedFungi as the next deepest branch. Such a deep position forFungi conflicts with analyses of several proteins (11) andSSUrRNA (6) that support a Fungi ! Metazoa relationship.Indeed, we also recovered a Fungi (! Microsporidia) rela-tionship with Metazoa in our protein ML tree, albeit with weaksupport. When Fungi are constrained with Metazoa (as mostdata would have them), trees in which Microsporidia branchedbefore Trichomonas, Giardia, or Trypanosoma could all berejected at the 0.05 level.

Reduction of rate heterogeneity between sites did notreduce support for M ! F with protein ML (Fig. 1; data notshown), whereas removal of fast-evolving sites dramaticallyincreased bootstrap support from 41% to 77% for M ! F inMP analyses. This last result is consistent with the hypothesisthat parsimony is particularly sensitive to long-branch effectscaused by unequal substitution rates (43). Analyses of RPB1DNA sequences also support a relationship between Micro-

sporidia and Fungi. Application of the LogDet transformationto variable sites at coding positions 1 ! 2 produced a tree (notshown) where M ! F was supported with BP 74%, rising to BP93% in the absence of the outgroup.

In summary, a relationship between Microsporidia andFungi is strongly supported by our analyses of the RPB1 datasets, and this relationship cannot be attributed to shared aminoacid or nucleotide biases, mutational saturation, or long-branch effects. Because RPB1 appears robust in supporting M! F, we were interested in how inferences from the elongationfactors, EF-1! and EF-2, which apparently support Micros-poridia-early, would stand up in the face of the same analyses.

Do Elongation Factors Support M ! F or Microsporidia-Early? Our ML analysis of the EF-2 protein sequences gave atree similar to the one recently published for the same data set(4) where the microsporidian Glugea plecoglossi is a longbranch at the base of the eukaryote clade. However, bootstrapsupport (with resampled datasets) for Glugea as first branchwas only 54%, and support was further reduced to 33% wheninvariant sites were removed. This is in striking contrast topublished local bootstrap support (75%) from the same dataset for the basal position of Glugea (4). However, localbootstrap support can only be interpreted as bootstrap prob-abilities of a particular internal branch when the other parts ofthe tree are correct (25), an assumption that is likely not metfor these data.

We surmised that a common amino acid bias with someoutgroups may be influencing the observed deep position ofthe long Glugea branch relative to other eukaryotes, becausethe amino acid frequency tree for the aligned sequencesgrouped Glugea with the outgroup Archaea Sulfolobus andMethanococcus. To investigate whether base-compositionaleffects and!or long-branch attraction was contributing to theobserved deep position for Glugea, we removed the archaealoutgroup sequences. We also removed the category of fastestevolving sites because these are expected to contribute most toany long-branch effect (33). Consistent with our hypothesisthat Glugea is branching deep because of artifact, ML and MPanalyses both recovered M ! F (Fig. 2A), albeit with weakbootstrap support (ML " 29%, MP " 40%).

FIG. 1. Phylogenetic analyses of RPB1. Effects on bootstrap support of ISR or FSR were considered by using protein ML. The tree shown isthe ML consensus tree topology from analysis of 760 aligned positions for 15 RPB1 sequences and 1 outgroup RPA1 sequence. Values reportbootstrap support from ML analyses for all 760 sites (ALL sites), 669 sites (FSR, where the fastest evolving sites common to the ML tree and atree where Microsporidia are at the base of the eukaryotes were removed), and 645 sites (ISR). Where only a single bootstrap value is shown, supportwas 100% in all analyses. The scale bar represents 10% estimated sequence divergence under the JTT-F model.

582 Evolution: Hirt et al. Proc. Natl. Acad. Sci. USA 96 (1999)

Keeling & Doolittle (1996) Mol Biol Evol 13:1297-1305; Hirt et al. (1999) PNAS USA 96:580-585

TUBA

RBP1

Not so Archezoa (3)

Analyses of transitions versus transversions for EF-2 codonpositions 1 or 2 (data not shown) suggest that Microsporidiasequences are potentially saturated. Furthermore, analyses ofbase compositions suggest that shared biases may potentiallyinfluence tree topology. For example, nucleotide frequencytrees clustered Glugea with the outgroup Archaea Sulfolobusand Methanococcus, with which it shares a similar base com-position (interestingly, Giardia and Trichomonas clusteredwith the other outgroup, Halobacterium). We therefore inves-tigated the effect of reducing site rate heterogeneity by usingprogressive CSR and outgroup choice on support for M ! For Microsporidia-early from the EF-2 DNA dataset (Fig. 2C–E).

ML estimates suggested that 93% of constant sites should beremoved from the EF-2 dataset for LogDet analyses if theassumption of the method—that all sites can vary—is not to beviolated (22, 23, 33). Under these conditions, the LogDet treerecovered G. plecoglossi as the sister group to Saccharomyceswith 76% bootstrap support (Fig. 2B). In the absence ofoutgroups, the M ! F relationship was strongly supported inall analyses (Fig. 2C). In the presence of outgroups, there weretwo conflicting positions for Microsporidia detected amongbootstrap partitions: (i) Microsporidia-early when all or mostconstant sites were included and (ii) M ! F, which increasedin support as constant sites were removed (Fig. 2C). Thissuggests that EF-2 support for Microsporidia-early is mainlycaused by a failure to adequately account for site-by-site rate

variation coupled with an outgroup-attraction effect. This lastphenomenon is clearly related to the base composition of theoutgroup. When Halobacterium was used as outgroup, M ! Fwas recovered even when all constant sites were included, andthere was never any strong support for Microsporidia-early(Fig. 2D). The Microsporidia-early hypothesis was stronglysupported only when (i) outgroup taxa, i.e., Sulfolobus (datanot shown) or Methanococcus (Fig. 2E), shared the same basecomposition bias as Glugea and (ii) too many constant siteswere included in the analysis.

Protein ML analyses of the EF-1! data set produced similartrees as previously published (4) with moderate (64%) boot-strap support for Glugea as the first branch. However, pairwisecomparisons for EF-1! codon positions 1 and 3 (data notshown) indicate potential saturation with all comparisonsinvolving Glugea strongly clustered and separated from otheramong-eukaryote comparisons. The pattern for changes atreplacement position 2 is even more extreme where theclustered points for Glugea are outside the range for the othereukaryote and Archaea outgroup comparisons (data notshown). Because all changes at position 2 result in a change ofamino acid, the use of the Glugea sequence for phylogeneticinference is compromised, and trees inferred from this data setmust be considered unreliable. Consistent with the hypothesisthat the Glugea EF-1! sequence is behaving in a manner thatis different from the other eukaryote sequences are observa-tions (44) that it contains many nonconservative amino acid

FIG. 2. Phylogenetic analyses of EF-2. (A) Protein ML tree with outgroups removed and FSR (72 sites). The scale bar represents 10% estimatedsequence divergence under the JTT-F model. (B–E) Phylogenetic analyses of EF-2 DNA codon positions 1 ! 2 using LogDet and investigatingthe influence of constant sites and choice of outgroup on bootstrap support. The DNA alignment was derived from a 542-aa alignment. (B) Bootstrapconsensus tree topology for LogDet distances estimated from 812 variable sites (93% of constant sites removed as invariant) indicating supportfor M ! F. The effect of incremental removal of observed constant sites on bootstrap support for different relationships in the presence or absenceof all 3 outgroups (C), with Halobacterium halobium as outgroup (D), and with Methanococcus vannielii as outgroup (E). The dashed vertical linesrepresent the ML estimates of invariant sites: 93%, 92%, and 93% of constant sites for C, D, and E, respectively (see text for discussion).

Evolution: Hirt et al. Proc. Natl. Acad. Sci. USA 96 (1999) 583

EF-2

microsporidies = champignons dérivés

Page 23: Denis BAURAIN

Long Branch AttractionY

XW

Z

p

q q p < q2

Y

XW

Z

si Y est un outgroup phylogénétiquement éloigné

Y

X

W

Z

Y

X

W

Z

Felsenstein (1978) Syst Zool 27:401-410; Philippe et al. (2000) Curr Opin Genet Dev 10:596-601

position artéfactuelle des Archezoa dans

l’arbre de C. Woese

surtout en parcimonie, mais

pas seulement

Joe Felsenstein

Page 24: Denis BAURAIN

The Six Supergroups

multicellularity and differentiation,derived independently of animals,plants and fungi. Myxogastridslime moulds, such as Physarum,form giant multinucleate super-cells (plasmodia) which arecommonly found in terrestrialecosystems.

Evidence that Amoebozoa is amonophyletic group has emergedonly recently. Traditionalclassifications grouped diverseamoeboid organisms together; thisunited most or all Amoebozoa, butgrouped them also with unrelatedforms such as Radiolaria,Foraminifera and ‘heliozoa’ (seebelow). By contrast, earlyphylogenetic studies of rRNAsequences suggested the variousAmoebozoa are independentgroups. This now appears to havebeen an artefact of the simplisticanalytical methods available at thetime — improved analyses andsampling of more amoebae tend toshow that these organisms arerelated, as do analyses ofindividual protein sequences(especially actin) and sophisticatedstudies of multiple proteins.

Plantae‘Primary endosymbiosis’ describesthe origin of a eukaryotic organelleby the engulfment, enslavementand genomic reduction of aprokaryotic cell. Threephotosynthetic groups haveplastids (chloroplasts) thatoriginated by primaryendosymbiosis: land plants

(embryophytes) and green algaesuch as Chlamydomonas; redalgae (rhodophytes); and anobscure group called theglaucophytes. Phylogenies ofseveral plastid genes and theorganisation of plastid genomessuggest that the plastids of thesegroups form a single lineagespecifically related tocyanobacteria. So primary plastidendosymbiosis seems to havehappened just once in eukaryoticevolution, with the host being acommon ancestor of these threegroups. Phylogenetic analyses,particularly some centred aroundthe gene for elongation factor 2indicate that ‘reds’ and ‘greens’ areclosely related, with glaucophytesperhaps being their sister group. Atpresent we follow many in the field,and refer to this whole group as‘Plantae’, but we caution that mostbotanists use ‘Plantae’ for subsetsof this group, such as green algaeplus land plants.

The incorporation of the primaryplastid had a huge effect on thegenetic potential and basic biologyof the host organisms. Almost allPlantae are specialist phototrophs;a few are non-photosyntheticparasites, but even theseorganisms retain plastids in areduced form. Plantae are the onlyone of the major groups that maylack entirely the ability to engulfparticulate food.

Multicellularity has evolved onseveral occasions within Plantae:probably once in red algae, but

multiple times within green algae.One particular multicellularassemblage, the ‘charophytes’,actually gave rise to theembryophytes that dominate landhabitats, but are of very minorimportance in the ocean (where, infact, extremely small unicellulargreen algae are significant).

ChromalveolataIn ‘secondary endosymbiosis’ aeukaryote already containing aprimary plastid is engulfed byanother host eukaryote, and overtime is reduced to an organelle.The new plastid-containing host istermed a ‘secondary alga’.Secondary endosymbiosis hashappened more than once ineukaryotes, but mounting evidencefrom plastid gene trees and adistinctive gene replacement eventsuggests that most groups ofsecondary algae descend from oneparticular endosymbiosis involvinga red algal symbiont. Theseorganisms, plus their many non-photosynthetic relatives, comprisethe group Chromalveolata.

The chromalveolates unites fourmajor groups of eukaryotic algae:dinoflagellates, cryptophytes,haptophytes and stramenopiles(~heterokonts), and many non-photosynthetic forms (see below).The first three groups areunicellular, with a few colonialforms. Stramenopiles, however,range from tiny unicells, through toelaborate unicells and colonies, forexample diatoms, and trulymulticellular and massive lifeforms, such as kelps.Dinoflagellates and diatoms are thedominant ‘large’ phytoplankton inthe ocean.

Dinoflagellates andstramenopiles also include a widediversity of heterotrophic forms(and mixotrophs, organisms thatsubsist by both photosynthesisand heterotrophy). Heterotrophicstramenopiles are very importantconsumers of bacteria in aquaticenvironments, but also includesome animalparasites/commensals, and adiversity of fungal-like forms. Forexample, the Irish potato faminepathogen, Phytophthora infestans,is an oomycete stramenopile.Heterotrophic and mixotrophicdinoflagellates are important

Current Biology Vol 14 No 17R694

Figure 1. A diagrammatic tree depicting the organisation of most eukaryotes into six majorgroups. The relationships amongst most of the major groups and the position of the ‘root’of the tree are shown as unresolved (note however, the grouping of Opisthokonta andAmoebozoa). The arrow shows a possible precise placement of the root, based on genefusion data (see text).

ArchaebacteriaEubacteria

(Lobose amoebae)

Mycetozoan slime moulds

Pelobionts,Entamoebae

Apusomonads

Centrohelid Heliozoa

RadiolariaGreen algae

+ Plants

Red algaeGlaucophytes

PlantaeAlveolates Stramenopiles

CryptophytesHaptophytes

Chromalveolata

Diplomonads, Retortamonads,CarpediemonasParabasalids

Animals

Fungi (inc.microsporidia)

Choano-flagellates

IchthyosporeaNucleariid amoebae

Opisthokonta

Amoebozoa

Euglenozoa, Heterolobosea, Jakobids

Malawimonas

Oxymonads, Trimastix Excavata

Rhizaria

Eukaryotes

Prokaryotes

Cercozoa (inc. Foraminifera?)

Collodictyonids?

Current Biology

~2002root

DHFR-TS gene fusion

Stechmann & Cavalier-Smith (2002) Science 297:89-91; Simpson & Roger (2004) Curr Biol 14:R693-696

‘unikonts’vs. ‘bikonts’

basalia), suggesting dependence on exoge-nous thymidine (10-12), and the gene isabsent from the virtually complete Giardiagenome. Possibly these parasites have lostthe fusion gene. rRNA trees not using long-branching archaebacteria as an outgroupplace Parabasalia and Metamonada togetherwith high bootstrap support (5, 13, 14) andoften place Archezoa as sisters to Percolozoa.The complex tetrakont ciliary apparatus ofArchezoa and Percolozoa was long an obsta-cle to considering them the most primitiveeukaryotes; more likely, they evolved fromsimpler biciliate eukaryotes (5, 9, 14). Asparabasalids and diplomonads share two de-rived laterally transferred genes of cyanobac-terial origin [glucokinase (GK) and glucose-phosphate isomerase (GPI); Fig. 1] (15, 16),Archezoa are almost certainly holophyletic,so the root cannot lie within them, whetherbeside metamonads, as rRNA trees rooted onarchaebacteria suggested, or Parabasalia, assuggested by two single amino-acid inser-

tions in enolase that are shared with pro-karyotes (7). The enolase insertions are farweaker evidence than the DHFR-TS fusion,as they could easily have been secondarilyacquired by replication slippage mutations orby a single gene conversion using as templatea bacterial enolase gene taken into thephagotrophic ancestor of Parabasalia after itdiverged from Metamonada and other exca-vates. Sequence evidence that retortamonadsarose from diplomonads (17) eliminates themas candidate “early amitochondrial eu-karyotes” by placing them firmly within theancestrally tetrakont metamonads.

Jakobid Loukozoa have also been con-sidered possible primitive eukaryotes be-cause their mitochondria retain more pro-karyotic features than others (8); retentionof the proteobacterial RNA polymerase byReclinomonas does not mean that it is themost ancient eukaryote—if the viral-typepolymerase that replaced it was present inthe ancestral eukaryote, the bacterial en-

zymes could have been lost independentlyin several lineages.

Although most published distance treesdo not show excavate monophyly, probablybecause of long-branch problems, one max-imum likelihood rRNA tree not rooted onarchaebacteria shows the monophyly of ex-cavates including Reclinomonas (5), withreasonably strong bootstrap support (87%)in the corresponding distance tree. In viewof this and of the ultrastructural unity ofexcavates and their complex ciliary roots(5), we predict that jakobid Loukozoa alsohave the gene fusion. Whether Loukozoaare monophyletic or polyphyletic is uncer-tain, as is the precise position of excavates.But given the strong evidence for bikontmonophyly, no uncertainty in their internalbranching order would be relevant to ourconclusion unless an unstudied lineageconsistently branches closer to Amoebozoathan any other bikonts and also has separateDHFR and TS genes.

Like shared laterally transferred genes,symbiogenetic acquisitions of chloroplastsare derived characters that help narrow downthe position of the eukaryotic root, for itcannot lie within any group created by asingle such event. Thus the root cannot bewithin Plantae or chromalveolates (5, 18). Ifeuglenoids and chlorarachneans got theirchloroplasts in one event in a common ances-tor, for which good evolutionary argumentsexist (18), then it cannot lie in any of itsdescendants (i.e., not within excavates or theCercozoa/Retaria clade; Fig. 1). Even ifeuglenoids acquired chloroplasts indepen-dently (5), an early timing of that event wouldexclude the root from some groups: the gndgene of plastid affinity in Percolozoa (19)implies a photosynthetic common ancestor ofPercolozoa and Euglenozoa, ruling out bothdiscicristates and Archezoa, if the latter reallyare sisters of Percolozoa (Fig. 1) [even thehighly divergent gnd of Archezoa (19) mighthave the same origin].

Although the DHFR-TS gene fusion isthe strongest available evidence for theposition of the eukaryote root—among uni-ciliate protozoa (5), independent corrobo-ration is desirable to rule out the theoreticalpossibility of its reversal in opisthokonts(and possibly Amoebozoa). Such reversalmight in principle occur by horizontalacquisition from bacteria of separate DHFRand/or TS genes and partially or totallydeleting the fusion gene. Our multiplealignments argue clearly against this; de-rived signature sequences indicate that theseparate opisthokont DHFR and TS genesare distinctly more similar to the fusiongenes than to the separate bacterial genes.Unfortunately, both genes are too short tomake robust trees.

A unique insertion in EF-1! of animals

Fig. 1. Phylogenetic rela-tionships of the major eu-karyote groups [modifiedfrom (5)]. Opisthokontshave separate DHFR andTS genes. Asterisks markall eight groups positive forthe bifunctional DHFR-TSfusion gene. The origin ofthe fusion gene at thesame time as the bikontflagellates is indicated ["and # mark primer sitesused here; for details and alist of the nine taxa stud-ied see (4)]. Triangles onthe gene organization dia-grams indicate the posi-tion of translation initia-tion codons. Although thebranching order amongthe bikont groups is uncer-tain in places (notably theprecise position of Helio-zoa and excavates), theseuncertainties are irrelevantto the conclusion that theroot lies below the com-mon ancestor of all thesebikont groups. The mono-phyly and internal branch-ings of excavates are alsouncertain, involving bothsequence and ultrastruc-tural evidence (5); the to-pology (not the rooting) ofthis tree is congruent withseveral (not all) recent se-quence trees (e.g., 2, 5, 6,9). The uncertain position of Amoebozoa relative to the root is emphasized by a dashed branch.Symbiogenetic events that created eukaryotic algae are shown: the primary origin of chloroplasts fromcyanobacteria to form the plant kingdom (23); the secondary symbiogenetic implantation of a red algalcell (circled R) into a heterotrophic host to form chromalveolates (5) and lateral transfer of green algalchloroplasts to create euglenoid and chlorarachnean algae—whether both got plastids in a singleancestral event [circled G (18)] or separately [circled plus sign (5)] is uncertain. Miozoa comprisedinoflagellates, Sporozoa and protalveolates; Loukozoa include jakobids and anaeromonads; Retariacomprise Foraminifera and Radiolaria (5).

R E P O R T S

5 JULY 2002 VOL 297 SCIENCE www.sciencemag.org90

basalia), suggesting dependence on exoge-nous thymidine (10-12), and the gene isabsent from the virtually complete Giardiagenome. Possibly these parasites have lostthe fusion gene. rRNA trees not using long-branching archaebacteria as an outgroupplace Parabasalia and Metamonada togetherwith high bootstrap support (5, 13, 14) andoften place Archezoa as sisters to Percolozoa.The complex tetrakont ciliary apparatus ofArchezoa and Percolozoa was long an obsta-cle to considering them the most primitiveeukaryotes; more likely, they evolved fromsimpler biciliate eukaryotes (5, 9, 14). Asparabasalids and diplomonads share two de-rived laterally transferred genes of cyanobac-terial origin [glucokinase (GK) and glucose-phosphate isomerase (GPI); Fig. 1] (15, 16),Archezoa are almost certainly holophyletic,so the root cannot lie within them, whetherbeside metamonads, as rRNA trees rooted onarchaebacteria suggested, or Parabasalia, assuggested by two single amino-acid inser-

tions in enolase that are shared with pro-karyotes (7). The enolase insertions are farweaker evidence than the DHFR-TS fusion,as they could easily have been secondarilyacquired by replication slippage mutations orby a single gene conversion using as templatea bacterial enolase gene taken into thephagotrophic ancestor of Parabasalia after itdiverged from Metamonada and other exca-vates. Sequence evidence that retortamonadsarose from diplomonads (17) eliminates themas candidate “early amitochondrial eu-karyotes” by placing them firmly within theancestrally tetrakont metamonads.

Jakobid Loukozoa have also been con-sidered possible primitive eukaryotes be-cause their mitochondria retain more pro-karyotic features than others (8); retentionof the proteobacterial RNA polymerase byReclinomonas does not mean that it is themost ancient eukaryote—if the viral-typepolymerase that replaced it was present inthe ancestral eukaryote, the bacterial en-

zymes could have been lost independentlyin several lineages.

Although most published distance treesdo not show excavate monophyly, probablybecause of long-branch problems, one max-imum likelihood rRNA tree not rooted onarchaebacteria shows the monophyly of ex-cavates including Reclinomonas (5), withreasonably strong bootstrap support (87%)in the corresponding distance tree. In viewof this and of the ultrastructural unity ofexcavates and their complex ciliary roots(5), we predict that jakobid Loukozoa alsohave the gene fusion. Whether Loukozoaare monophyletic or polyphyletic is uncer-tain, as is the precise position of excavates.But given the strong evidence for bikontmonophyly, no uncertainty in their internalbranching order would be relevant to ourconclusion unless an unstudied lineageconsistently branches closer to Amoebozoathan any other bikonts and also has separateDHFR and TS genes.

Like shared laterally transferred genes,symbiogenetic acquisitions of chloroplastsare derived characters that help narrow downthe position of the eukaryotic root, for itcannot lie within any group created by asingle such event. Thus the root cannot bewithin Plantae or chromalveolates (5, 18). Ifeuglenoids and chlorarachneans got theirchloroplasts in one event in a common ances-tor, for which good evolutionary argumentsexist (18), then it cannot lie in any of itsdescendants (i.e., not within excavates or theCercozoa/Retaria clade; Fig. 1). Even ifeuglenoids acquired chloroplasts indepen-dently (5), an early timing of that event wouldexclude the root from some groups: the gndgene of plastid affinity in Percolozoa (19)implies a photosynthetic common ancestor ofPercolozoa and Euglenozoa, ruling out bothdiscicristates and Archezoa, if the latter reallyare sisters of Percolozoa (Fig. 1) [even thehighly divergent gnd of Archezoa (19) mighthave the same origin].

Although the DHFR-TS gene fusion isthe strongest available evidence for theposition of the eukaryote root—among uni-ciliate protozoa (5), independent corrobo-ration is desirable to rule out the theoreticalpossibility of its reversal in opisthokonts(and possibly Amoebozoa). Such reversalmight in principle occur by horizontalacquisition from bacteria of separate DHFRand/or TS genes and partially or totallydeleting the fusion gene. Our multiplealignments argue clearly against this; de-rived signature sequences indicate that theseparate opisthokont DHFR and TS genesare distinctly more similar to the fusiongenes than to the separate bacterial genes.Unfortunately, both genes are too short tomake robust trees.

A unique insertion in EF-1! of animals

Fig. 1. Phylogenetic rela-tionships of the major eu-karyote groups [modifiedfrom (5)]. Opisthokontshave separate DHFR andTS genes. Asterisks markall eight groups positive forthe bifunctional DHFR-TSfusion gene. The origin ofthe fusion gene at thesame time as the bikontflagellates is indicated ["and # mark primer sitesused here; for details and alist of the nine taxa stud-ied see (4)]. Triangles onthe gene organization dia-grams indicate the posi-tion of translation initia-tion codons. Although thebranching order amongthe bikont groups is uncer-tain in places (notably theprecise position of Helio-zoa and excavates), theseuncertainties are irrelevantto the conclusion that theroot lies below the com-mon ancestor of all thesebikont groups. The mono-phyly and internal branch-ings of excavates are alsouncertain, involving bothsequence and ultrastruc-tural evidence (5); the to-pology (not the rooting) ofthis tree is congruent withseveral (not all) recent se-quence trees (e.g., 2, 5, 6,9). The uncertain position of Amoebozoa relative to the root is emphasized by a dashed branch.Symbiogenetic events that created eukaryotic algae are shown: the primary origin of chloroplasts fromcyanobacteria to form the plant kingdom (23); the secondary symbiogenetic implantation of a red algalcell (circled R) into a heterotrophic host to form chromalveolates (5) and lateral transfer of green algalchloroplasts to create euglenoid and chlorarachnean algae—whether both got plastids in a singleancestral event [circled G (18)] or separately [circled plus sign (5)] is uncertain. Miozoa comprisedinoflagellates, Sporozoa and protalveolates; Loukozoa include jakobids and anaeromonads; Retariacomprise Foraminifera and Radiolaria (5).

R E P O R T S

5 JULY 2002 VOL 297 SCIENCE www.sciencemag.org90

Page 25: Denis BAURAIN

Phylogenomics

!""# $%&%'()*+',-*./#01#-2$$&'3'4.1#5"!"

#6789:;<7=>#<6#?@AB<C:D:EFG# FD6:7:DG:# H>::# E@:# :IG@8DC:#J:E;::D# 9A>:B6# 8DK# L(M4# N%.'-/# FD# ?87EFGOB87P# N%.'-/#:E#8BQ#5""5R#S*4*4)%T'3(4)-#:E#8BQ#5""0R#S*4*4)%T'3(4)-#5""UJR##N%.'-/#:E#8BQ#5""UVQ#W7FEFGF>9>#<6#>O?:7E7::>#78DC:#67<9# >@<7EG<9FDC># FD# >?:GF!#G# F9?B:9:DE8EF<D># <6# E@:#9:E@<K# E<# ?:7G:FX:K# >@<7EG<9FDC># E@8E# 87:# 6ODK89:DE8B#E<#E@:#9:E@<K#FE>:B6Q#%9<DC#E@:#B8EE:71#E@:#=:A#G<DG:7D#F>#E@8E#>O?:7E7::>1#JA#G<9JFDFDC#8DK#8D8BAYFDC#E@:#E7::>#K:T7FX:K#67<9#G@878GE:7#K8E8# 78E@:7# E@8D#8D8BAYFDC# E@:#K8E8#KF7:GEBA1#7:?7:>:DE#8#9:E8T8D8BA>F>#<D:#>E:?#7:9<X:K#67<9#E@:#7:8B#K8E8#HN%.'-/#Z#-$,*4N',#5""UVQ#.@:#FD@:7:DE#B<>>#<6#FD6<798EF<D#E@F>#68GE#:DE8FB>#8OE<98EFG8BBA#E78D>B8E:>#6<7#><9:#E<#8D#FD@:7:DE#K:G7:8>:#FD#8GGO78GA#G<9?87:K#E<#8#>O?:798E7FI#8D8BA>F>Q#*D>E:8K1#E@:#9:E8T8D8BA>F>#D8EO7:#<6#>O?:7E7::>#G8D#8B><#J:#XF:;:K#8>#8#?<E:DEF8B#>E7:DCE@Q#S:TG8O>:#F>>O:>#<6#G@878GE:7#K8E8#G<9JFD8JFBFEA#K<#D<E#866:GE#>O?:7E7::#G<D>E7OGEF<D1#9<7:#<6#E@:#E<E8B#?@AB<C:D:EFG#K8TE8J8>:#G8D#J:#O>:K#E<#K:7FX:#E@:#:X<BOEF<D87A#E7::>Q#.@F>#68GE1# FD# B87C:#9:8>O7:1# 8GG<ODE># 6<7# E@:#8JFBFEA#<6# >O?:7T

E7::>#E<#<JE8FD#9<7:#G<9?7:@:D>FX:#?@AB<C:D:EFG#E7::>#6<7#9<>E#C7<O?># E@8D# F>#GO77:DEBA#?<>>FJB:#O>FDC#8#>O?:798TE7FI# 8??7<8G@Q#'X:D# ><1# FE# F># @:BK#JA# ><9:# E@8E# ?@AB<C:TD:EFG# >O?:7E7::>#9:7:BA# 7:?7:>:DE# 8# >E<?C8?#9:8>O7:# 6<7#?@AB<C:D:EFG#FD6:7:DG:#ODEFB#>O6!#GF:DE#9<B:GOB87#K8E8#J:TG<9:#8X8FB8JB:#E<#:D8JB:#E@:#9<7:#K:>F78JB:#>O?:798E7FI#8D8BA>:>Q#[FE@#E@:#:X:7TFDG7:8>FDC#?8G:#8DK#:X:7TK:G7:8>TFDC#G<>E>#<6#@FC@#E@7<OC@?OE#>:\O:DGFDC1#E@F>#<?FDF<D#K<:>#@8X:#8#G:7E8FD#X8BFKFEA#E<#FE1#8BJ:FE#9OG@#9<7:#><#6<7#G@87TF>98EFG#C7<O?># H:Q#CQ1#98998B>#<7#"#<;:7FDC#?B8DE>V# E@8D#6<7#<E@:7#687#B:>>#;:BB#>EOKF:K#<D:>#H:Q#CQ1#7<EF6:7>#8DK#98DA#<E@:7#9FG7<68OD8B#E8I8VQ[@8E# E@:D#K<:># E@:# 6OEO7:#@<BK#6<7#>O?:7E7::>1# F6#8DT

AE@FDC]# %># *# 87CO:K# ><9:# A:87># 8C<# HS*4*4)%T'3(4)-#5""U8V1# E@:# 8??BFG8EF<D# <6# 8# >O?:7E7::# 6789:;<7=# ;FBB#C78KO8BBA#>@F6E#67<9#FE>#E78KFEF<D8B#8??BFG8EF<D#<6#G<9JFDTFDC# ><O7G:# E7::># <JE8FD:K# 67<9# E@:# BFE:78EO7:# E<# J:G<9:#9<7:# FDE:C78E:K# ;FE@# E@:# >O?:798E7FI# 6789:;<7=# H>::##^FCQ# !VQ# .@:# EF9FDC# <6# E@F># >@F6E# K:?:DK># B87C:BA# <D# E@:#

!"#$%&Q#S7:8=FDC#K<;D#E@:#;8BB#J:E;::D#>O?:798E7FI#HB:6EV#8DK#>O?:7E7::#H7FC@EV#8D8BA>:>Q#*D>E:8K#<6#J:FDC#J8>:K#<D#KF>EFDGE#K8E8#>:E>#HE<?#B:6E#8DK#E<?#7FC@E#7:>?:GEFX:BAV1#J<E@#6789:;<7=>#;FBB#8D8BAY:#E@:#>89:#H9<B:GOB87V#K8E8#>:E#HE<?#B:6EV#FD#E@:#6OEO7:Q#%#>O?:7E7::#8D8BA>F>#<6#?87EFEF<D>#FD#E@F>#K8E8#>:E#;FBB#E@:D#J:#G<9?87:K#KF7:GEBA#E<#E@:#>O?:798E7FI#><BOEF<D#FD#8#CB<J8B#G<DC7O:DG:#6789:;<7=#HJ<EE<9#9FKKB:V#8DK_<7#O>:K#E<#>::K#8#>O?:798E7FI#8D8BA>F>#8>#?87E#<6#8#KFXFK:T8DKTG<D\O:7#>E78E:CA#H9FKKB:VQ

(Supermatrix)

Bininda-Emonds (2010) Palaeodiversity 3 (Suppl.):99-106

Page 26: Denis BAURAIN

Supermatrix Assembly74 L’inference phylogenomique

S1

S1 A C G T C A A G

S2 A C - T C C A G

S3 A C - T C G A C

S2

S1 T G G - - T

S3 A G C T C C

S4 A G C T C G

S3

S1 C G G A C T A C G T

S4 C C C T - - - - G G

S5 C G T T C G A C G T

S1 S2 S3

S1 A C G T C A A G T G G - - T C G G A C T A C G T

S2 A C - T C C A G . . . . . . . . . . . . . . . .

S3 A C - T C G A C A G C T C C . . . . . . . . . .

S4 . . . . . . . . A G C T C G C C C T - - - - G G

S5 . . . . . . . . . . . . . . C G T T C G A C G T

FIG. 4.1 – Exemple de supermatrice de caracteresA partir de trois alignements de sequences d’ADN S1, S2 et S3 (en haut), une supermatrice de caractere a ete construite

en concatenant les trois alignements de sequences et en figurant les etats de caractere manquants par un point (en bas).

4.1.2 Analyse simultanee a partir d’une supermatrice de caracteres

Il n’est pas rare d’observer des differences topologiques entre deux arbres phylogenetiques

definis sur le meme ensemble d’especes mais inferes a partir de deux genes differents. Ce

fait, nomme incongruence entre genes, est souvent provoque par le phenomene d’homoplasie

(Sober, 1988). Par homoplasie, on entend l’apparition independante d’etats de caractere simi-

laires chez des taxons eloignes, impliquant souvent un phenomene d’attraction des longues

branches, en particulier si on utilise les criteres MP ou de distance. L’homoplasie est subdi-

visee en convergence (apparition independante d’un meme etat de caractere) et reversion (ap-

parition d’un etat de caractere ayant l’apparence d’un etat de caractere ancestral). Une forte

heterogeneite des taux d’evolution entre sites est une des causes de l’homoplasie lorsque l’on

considere des sequences de caracteres moleculaires. Les transferts horizontaux de genes sont

egalement responsables de l’incongruence entre genes. Ainsi, lorsque l’on souhaite inferer un

arbre phylogenetique a partir d’une collection de genes, il est souvent recommande d’effectuer

des tests d’incongruence entre genes, e.g. test ILD (Farris et al., 1995). Une premiere approche

consiste a eliminer du jeu de donnees initial le(s) gene(s) presentant une forte incongruence par

rapport aux autres. Une autre approche revient a detecter et eliminer seulement les etats de ca-

racteres responsables de l’incongruence. Ces deux approches ne sont pas incompatibles avec

le principe TE (Lecointre and Deleporte, 2005).

Les techniques de combinaison basse peuvent souvent etre handicapees par

l’apparition de nombreuses donnees manquantes (e.g. 46% d’etats de caractere

manquants dans l’exemple de la Figure 4.1). Les etudes de phylogenomiques

recentes basees sur la combinaison basse offrent souvent plus de 50% d’etats de

tel-0

0142

222,

ver

sion

1 -

17 A

pr 2

007

Delsuc et al. (2005) Nat Rev Genet 6:361-375; Criscuolo (2006) Thèse de Doctorat

Page 27: Denis BAURAIN

modifié de Roger & Simpson (2009) Curr Biol 19:R165-7

Lack of Resolution

cytokinesis and are enhanced by Rhoand suppressed by Rac. J. Cell Biol. 166,61–71.

16. Severson, A.F., Baillie, D.L., and Bowerman, B.(2002). A formin homology protein and aprofilin are required for cytokinesis andArp2/3-independent assembly of corticalmicrofilaments in C. elegans. Curr. Biol. 12,2066–2075.

17. Zhang, W., and Robinson, D.N. (2005). Balanceof actively generated contractile and resistiveforces controls cytokinesis dynamics. Proc.Natl. Acad. Sci. USA 102, 7186–7191.

18. Yamada, T., Hikida, M., and Kurosaki, T. (2006).Regulation of cytokinesis by mgcRacGAP in Blymphocytes is independent of GAP activity.Exp. Cell Res. 312, 3517–3525.

19. Schonegg, S., Constantinescu, A.T., Hoege, C.,and Hyman, A.A. (2007). The Rho GTPase-activating proteins RGA-3 and RGA-4 arerequired to set the initial size of PAR domainsin Caenorhabditis elegans one-cellembryos. Proc. Natl. Acad. Sci. USA104, 14976–14981.

20. Schmutz, C., Stevens, J., and Spang, A. (2007).Functions of the novel RhoGAP proteins RGA-3

and RGA-4 in the germ line and in the earlyembryo of C. elegans. Development 134,3495–3505.

Department of Molecular Genetics and CellBiology, University of Chicago, Chicago,IL 60637, USA.E-mail: [email protected]

DOI: 10.1016/j.cub.2008.12.028

Evolution: Revisiting the Root of theEukaryote Tree

A recent phylogenomic investigation shows that the enigmatic flagellateBreviata is a distinct anaerobic lineage within the eukaryote super-groupAmoebozoa and challenges the unikont–bikont rooting of the tree ofeukaryotes.

Andrew J. Roger1,2

and Alastair G.B. Simpson1,3

In the 1980s and 1990s, prevailingviews of the eukaryote tree of life werestrongly influenced by phylogenies ofsmall subunit ribosomal RNA (rRNA)genes [1]. Although these analysesplaced many eukaryotes into majorgroups, it became clear that therelationships amongst these groupscould not be determined becauseof the limited information availablein a single gene, as well asmethodological artifacts [2]. Morerecently, a ‘six super-groups’hypothesis for deep eukaryotephylogeny emerged as a synthesis ofanalyses of sequence data for rRNAs,concatenated sets of conservedproteins and organellar genomes,and some detailed ultrastructuralcomparisons [3]. The six super-groupsproposed are the opisthokonts,Amoebozoa, Archaeplastida,chromalveolates, Rhizaria andExcavata. In the absence of outgroupsequences that are sufficientlyclosely related to allow reliablerooting of eukaryotes in molecularphylogenies, Cavalier-Smith andcolleagues proposed that thepresence or absence ofa dihidrofolate reductase-thymidylatesynthase (DHFR-TS) gene fusion [4]and specific myosin gene families [5]in diverse eukaryotes could be usedto infer that the eukaryote root fallsbetween so-called ‘unikonts’(opisthokonts and Amoebozoa) and

‘bikonts’ (all other super-groups),as shown in Figure 1. Both the sixsuper-groups model and theunikont–bikont root hypothesis have

been controversial since they werefirst proposed [6]. Now, with therapid accumulation of genome-scaledata for diverse protist species,a flurry of phylogenomic analyses[7–9] are putting these hypothesesto the test.A recent paper by Minge et al. [7]

reports phylogenomic analyses of theenigmatic protist Breviata anathema,a small amoeba-like cell with ananterior flagellum. Breviata isinteresting for two major reasons: itlives in low oxygen conditions and

Current BiologyEubacteria Archaebacteria

ProkaryotesEukaryotes

Slime moulds

Phalansterium

Archamoebae

inc. Animals and Fungi

Rhizaria

Archaeplastida(= Plantae)

Alveolata

ExcavataOpisthokonts

Amoebozoa

Stramenopiles

Cryptomonads +Haptophytes

22 2 2

22

2

Loboseamoebae

Multicilia

Apusomonads

10

11

0

1/2

2*

2+

‘Unikonts’ ‘Bikonts’

Breviata

Figure 1. The placement of Breviata anathema in the eukaryote tree of life.

The relationships amongst the six super-groups of eukaryotes are shown as recovered byMingeet al. [7] and other recent phylogenomic analyses [8,9]. The hypothesized super-groups arecolour-coded as follows: opisthokonts (purple), Amoebozoa (light blue), Archaeplastida (green),chromalveolates (orange), Rhizaria (dark blue) and Excavata (brown). Note that recent evidencesuggests that Rhizaria are specifically related to some chromalveolates [8,9]. The tree is shownas rooted according to the unikont–bikont hypothesis [5,14]. Anaerobic/microaerophilicprotistan lineages that lack classical mitochondria are shown in red. The numbers in ellipsesshow the inferred ancestral number of basal bodies per kinetid (flagellar unit) in the variouseukaryote lineages. The plus (+) indicates that Breviata may contain more basal bodies thanthe number cited whereas the asterisk (*) indicates that one basal body is non-flagellated.Dashed lineages indicate uncertainty in the location of that branch on the tree (see text).

DispatchR165

~2010

Page 28: Denis BAURAIN

possible ‘big-bang’ radiation of Eukaryotes(worsened by phylogenetic artefacts)

Issues (1)

Page 29: Denis BAURAIN

horizontal gene transfer (HGT) due to« You are what you eat »

Issues (2)

support for HGT having played a signif-icant role in the evolution of thisorganism.

Because all of the genes studied byArchibald et al. are encoded within thenucleus of B. natans but operate withinthe plastid, the signal and transit pep-tides are absolutely necessary. One cansurmise that this necessity wouldstrongly favor HGT in and among algae,where nuclear-encoded genes targetedto plastids must navigate a similar mazeof endomembranes through the direc-tion of transit peptides, and additionalsignal peptides in the case of other sec-ondary algae. Indeed, a majority ofgenes studied by the authors supportthis expectation, favoring phylogeniesconsistent with HGT from streptophytesor red algae into the B. natans genome.

Most intriguingly, two of the genes fromtheir analysis indicate HGT from differ-ent bacteria, significant not only as anexample of prokaryote-to-eukaryotegene transfer but also because these ac-quired genes initially would have nothad the proper leader sequence forimport into the plastid. Whether theappropriate targeting sequence was in-corporated de novo through gene con-version or some other mechanism ofhomologous or orthologous replacementis not clear, but this remarkable findingcertainly invokes new ideas on howgenes are assimilated into a genome.

Microbiologists have long knownabout phenotypes that favor promiscu-ous plasmid sharing among bacteria,responsible for the epidemic spread ofantibiotic resistance. Although no plas-

mid analog exists in eukaryotes,Archibald et al. suggest that HGT in B.natans may occur in the same way it hasfor the many endosymbiotic events thathave happened over the past 2 billionyears, by engulfing other organisms (Fig.1). Compared with the green algaChlamydomonas reinhardtii, which isphotoautotrophic and in which no paral-lel evidence of HGT is found, B. natansis mixotrophic, meaning that it can livephagocytotically and photosynthetically.Models have been proposed wherebysmall snippets of DNA from an en-gulfed microbe are able to escape diges-tion, e.g., from protist lysozomes, andmigrate to and subsequently be incorpo-rated into the host genome (6, 16). Onecan imagine the series of fleetingly smallprobability events proceeding from en-gulfment to incorporation of a strand offoreign DNA into the genome to a newgene overcoming genetic drift to be-come fixed in the population. In a cer-tain sense, this is the same cumulativeeffect as random mutations in singlegenes or dripping water, but it now op-erates on a different level, an entiregene. However, the essential point isthat those probabilities are nonzero andover time have made a significant con-tribution to the genome of this organism(6). The real boon of Archibald et al.’shypothesis, perhaps best synopsized byFord Doolittle’s epigram ‘‘you are whatyou eat’’ (16), is that it is eminently test-able as more eukaryotic genome databecome available. It is already apparentthat the magnitude of HGT varies dra-matically in different lineages of algae,with the proposed explanation of thephagocytotic lifestyle as a likely but notproven explanation for the observedmosaic pattern. Whether this is a moregeneral mechanism for HGT in a widerrange of eukaryotes, including nonalgaltaxa, is not yet apparent.

So although Nature herself may notmake leaps, it now seems clear thatmany organisms, eukaryotes and pro-karyotes, are certainly able to mimicevolutionary jumps through HGT.

1. Doolittle, W. F. (1999) Trends Cell Biol. 9,M5–M8.

2. Gogarten, J. P., Doolittle, W. F. & Lawrence, J. G.(2002) Mol. Biol. Evol. 19, 2226–2238.

3. Ochman, H., Lawrence, J. G. & Groisman, E. A.(2000) Nature 405, 299–304.

4. Avery, O. T., MacLeod, C. M. & McCarty, M.(1944) J. Exp. Med. 79, 137–159.

5. Archibald, J. M., Rogers, M. B., Toop, M., Ishida,K.-i. & Keeling, P. J. (2003) Proc. Natl. Acad. Sci.USA 100, 7678–7683.

6. Doolittle, W. F., Boucher, Y., Nesbo, C. L.,

Douady, C. J., Andersson, J. O. & Roger, A. J.(2003) Philos. Trans. R. Soc. London B 358, 39–57;discussion 57–58.

7. Stiller, J. W. & Hall, B. D. (1997) Proc. Natl. Acad.Sci. USA 94, 4520–4525.

8. Stiller, J. W., Riley, J. & Hall, B. D. (2001) J. Mol.Evol. 52, 527–539.

9. Kirschvink, J. L., Gaidos, E. J., Bertani, L. E.,Beukes, N. J., Gutzmer, J., Maepa, L. N. &Steinberger, R. E. (2000) Proc. Natl. Acad. Sci.USA 97, 1400–1405.

10. McFadden, G. I. (2001) J. Phycol. 37, 951–959.

11. Douglas, A. E. & Raven, J. A. (2003) Philos.Trans. R. Soc. London B 358, 5–17; discussion17–18.

12. Douglas, S. E. & Gray, M. W. (1991) Nature 352,290 (lett.).

13. Douglas, S. E. (1992) Biosystems 28, 57–68.14. McFadden, G. I. (1999) J. Eukaryotic Microbiol.

46, 339–346.15. Deane, J. A., Fraunholz, M., Su, V., Maier, U. G.,

Martin, W., Durnford, D. G. & McFadden, G. I.(2000) Protist 151, 239–252.

16. Doolittle, W. F. (1998) Trends Genet. 14, 307–311.

Fig. 1. Stepwise conceptual image of one mechanism of HGT that, as proposed by Archibald et al. (5),might operate in the chlorarachniophyte alga B. natans. Red arrows show phagocytosis and subsequentdigestion of a bacterium or protist, from which foreign DNA has survived digestion and becomeincorporated into the algal nucleus (flow of HGT-acquired genetic information indicated with bluearrows). Although the genes studied herein by Archibald et al. are directed for function to the plastid, thesignificant number of horizontally transferred genes they found may only be the tip of the iceberg inphagocytic protists such as B. natans.

7420 ! www.pnas.org"cgi"doi"10.1073"pnas.1533212100 Raymond and Blankenship

Page 30: Denis BAURAIN

a distantly related photosynthetic eukaryote whose plastid

evolved directly from the cyanobacterial plastid progenitor.

Inferring how many times the ‘primary’ plastids of red algae,

green algae (and plants) and glaucophyte algae evolved into

‘secondary’ plastids is an area of active investigation and

debate.(22–25) No secondary plastids derived from glauco-

phytes are known, but both green and red algae have, each at

the very least on one occasion, been captured and converted

into a secondary plastid (Fig. 2). This process involves a

second round of EGT, this time from the endosymbiont

nucleus to that of the secondary host (Fig. 1), as well as the

evolution of another protein import pathway built on top of that

used by primary plastids.(5,26) For these reasons, successful

integration of a secondary endosymbiont is thought by many

to be difficult to achieve, and secondary endosymbiosis is

thus usually invoked only sparingly.

Secondary plastids of green algal origin occur in

euglenophytes and chlorarachniophytes, whereas most

plastids in so-called ‘chromalveolates’ are derived from red

algae. Chromalveolates include cryptophytes, haptophytes,

dinoflagellates, apicomplexans, the newly discovered coral

alga Chromera velia and stramenopiles (or heterokonts), the

latter being the group to which diatoms belong (Fig. 2). The

chromalveolate hypothesis was put forth as an attempt to

N2

N1

HGT

M2

M1

CB

HGT HGT

Secondary host

Plastidprogenitor

Primary host

EGTEGT

Figure 1. Endosymbiosis and gene flow in photosynthetic eukar-yotes. Diagram depicts movement of genes in the context of primaryand secondary endosymbiosis, beginning with the cyanobacterialendosymbiont (CB) that gave rise to modern-day plastids. Acquisitionof genes by horizontal (or lateral) gene transfer is always a possibility,and such genes can be difficult to distinguish from those acquired byendosymbiotic gene transfer (EGT). CB, cyanobacterium; HGT, hor-izontal gene transfer; MT, mitochondrion.

Archaeplastida

Apicomplexans +/-Chromera

Ciliates

Rhizaria

ForaminiferaOther Rhizaria

Other heterotrophs

Stramenopiles

Alveolates

GoniomonasKatablepharidsTelonemids

'Chromalveolates'

Perkinsids (?)

SAR

Centrohelids

Oomycetes

DiatomsOther phototrophs

Haptophytes

Cryptophytes

(Pico)biliphytes (?)

Chlorarachniophytes

Hacrobia

2o

Amoebozoa

Lobose amoebaeSlime mouldsIncludes

Excavata

Kinetoplastids

Heterolobosea

Euglenophytes

ParabasalidsDiplomonads

Jakobids

2o

Includes

OpisthokontsAnimals

Fungi

IncludesChoanoflagellates

3o

Green algae + plants

Red algaeGlaucophytes

1o

2o

HGT

HGT HGT

HGT

HGT

2oApusomonads,Other lineages

3o

Dinoflagellates +/- 3o

Figure 2. Origin and spread of photosynthesis across the eukaryotic tree of life. Diagram shows hypothesised ‘supergroups’ with emphasis onthose containing photosynthetic lineages. Some (but not all) lineages within the different supergroups are provided for context. The treetopology shown within the ‘chromalveolate’!SAR (Stramenopiles!Alveolates!Rhizaria) clade represents a synthesis of phylogenetic andphylogenomic data published as of 31 August 2009. Branch lengths do not correspond to evolutionary distance. Dashed lines indicateuncertainties with respect to the timing and/or directionality of secondary (28) or tertiary (38) endosymbiotic events, question marks (?) indicateuncertainty as to the presence of a plastid and ‘!/"’ indicates that both plastid-bearing (!) and plastid-lacking (") dinoflagellates andapicomplexans exist. Examples of green and red algal-derived tertiary plastids in dinoflagellates are known (see text for discussion). HGT,horizontal (or lateral) gene transfer.

What the papers say M. Elias and J. M. Archibald

1274 BioEssays 31:1273–1279, ! 2009 Wiley Periodicals, Inc.

Issues (3)

adapted from Elias and Archibald (2009) BioEssays 31:1273-1279

Plantae

endosymbiotic gene transfer (EGT) due tomultiple plastid transfers

Page 31: Denis BAURAIN

ANRV329-GE41-08 ARI 21 June 2007 22:19

Phagotrophic Plantae ancestor

PhotosyntheticPlantae ancestor

Cyanobacteria (prey)

Preydigestion

Primaryendosymbiosis

Preyretention

Plastid establishment

Figure 2Hypothetical model showing the primary endosymbiotic origin of theplastid in the ‘Plantae’ common ancestor.

Antiporter: anintegral membraneprotein that couplesthe active transportof two differentmolecules inopposite directionsacross themembrane, as in theplastid triosephos-phate/phosphateantiporter

ER: endoplasmicreticulum

cyanobacterium (10, 56, 103). The sisterof Paulinella chromatophora, Paulinella ovalis,lacks a plastid but is an active predator ofcyanobacteria that are localized in food vac-uoles in the cytoplasm (43). This observationprovides some support for the phagotrophicorigin of the ancient plastid. Molecularclock analyses using multigene data sets and“relaxed clock” approaches (e.g., penalizedlikelihood, Bayesian methods) that do notassume strict chronometric behavior of genesunder study suggest that the ‘Plantae’ primaryendosymbiosis is an ancient event in eukary-otic evolution. Although still controversial(23), recent analyses suggest that the primaryplastid was established ca. 1.5 billion years agoin the Mesoproterozoic (e.g., 14, 35, 59, 101).

Early Events in ‘Plantae’ PlastidEvolutionWe have hypothesized that a crucial earlystep in endosymbiosis must have been the es-

tablishment of a reliable connection betweenthe host cell and the ancestral plastid to allowthe controlled exchange of metabolic inter-mediates between the symbiotic partners.Regulated exchange is important because theunfettered flux of metabolites between thehost and plastid would have had detrimentaleffects on the metabolism of both partnersand thereby lowered the evolutionary fitnessof the symbiosis. A diverse set of metaboliteantiporters that are embedded in the innermembrane of current day plastids allows thecontrolled exchange of solutes between cellu-lar compartments (100). This antiport func-tion is dependent on the presence of a suitablecounterexchange substrate on the trans-siteof the membrane. It was recently shown thatthe plastid triosephosphate and related sugartransporters were established in the commonancestor of the red and green algae (and likelyall ‘Plantae’, supporting their monophyly), al-lowing this first alga to profit from cyanobac-terial carbon fixation (99). This evolutionarystep likely rendered irreversible the associ-ation between the plastid and the host cell.The ancestral plastid antiporter evolved froman existing metabolite translocator in the hostcell that had evolved due to the pre-existenceof mitochondria and an endomembrane sys-tem and was likely transferred to the ancientplastid via membrane fusion. This hypothesisof a close interaction between plastid enve-lope membranes and the host endomembranesystem is supported by the observation thatthe outer leaflet of the outer membrane ofextant primary plastids consists mainly ofER-derived phospholipids (17, 21). A morerecent analysis of 83 annotated plastid solutetransporters from Arabidopsis thaliana showsthat the majority of these genes that have aresolved phylogeny in this taxon and in other‘Plantae’, including all carbohydrate trans-porters, originated from co-option (i.e., geneduplication followed by retargeting to theplastid) of host genes, whereas about a quarterare of cyanobacterial (i.e., endosymbiont)provenance. The nuclear origin of manytransporters is supported by their absence in

150 Reyes-Prieto ·Weber · Bhattacharya

An

nu

. R

ev.

Gen

et.

20

07

.41

. D

ow

nlo

aded

fro

m a

rjo

urn

als.

ann

ual

rev

iew

s.o

rgb

y U

NIV

ER

SIT

E D

E L

IEG

E o

n 0

7/0

3/0

7.

Fo

r p

erso

nal

use

on

ly.

endosymbiose primairePlantae

endosymbiose secondaire« Chromalveolés »

ANRV329-GE41-08 ARI 21 June 2007 22:19

baEmiliania huxleyi

Isochrysis galbanaPrymnesium parvum

Karenia brevis

Thalassiosira pseudonanaPhaeodactylum tricornutum

Heterocapsa triquetraAlexandrium tamarenseAmphidinium carterae

Galdieria sulphurariaCyanidioschyzon merolae

EudicotArabidopsis thaliana

Oryza sativaChlamydomonas reinhardtiiBigelowiella natans

Nostoc sp. PCC7120Crocosphaera watsonii WH8501

Synechococcus elongatus PCC6301Gloeobacter violaceus PCC7421

Euglena gracilis

ChromalveolatesRed algaeCyanobacteria

Green algaeChlorarachniophyteEuglenid

Red algaeAlveolates

Stramenopiles

Haptophytes

Cryptophytes

Glaucophytealgae

Photosynthetic‘Plantae’ ancestor

AncestralChromalveolate

Euglenids

Chorarachniophytes

Greenalgae

Figure 4The origin(s) of plastids in photosynthetic eukaryotes. (a) Multiple lines of evidence (see text) support thesingle origin of the primary plastid in the ‘Plantae’ common ancestor. The plastid in red and green algaewas then transferred to chromalveolates, euglenids, and chlorarachniophyte amoebae via independentsecondary endosymbioses. (b) Phylogenetic tree based on maximum likelihood analysis of a data set of 6nuclear-encoded plastid-targeted proteins that shows the origin of the primary plastid in ‘Plantae’ from acyanobacterial source (blue circle), the secondary origin of the red algal plastid (red circle) inchromalveolates, and the independent origins of the green algal plastid (green circles) in euglenids, andchlorarachniophytes (see text for details). These latter two groups are not part of the phylogeneticanalysis and have been simply added to the tree.

analyses of nuclear-encoded plastid-targetedproteins that supports the monophyly of chro-malveolate plastids is shown in Figure 4b.The separate origins of the chlorarachnio-phyte and euglenid green plastids that was in-ferred from analysis of plastid genomes fromthese taxa (85) have been added to this tree.The potential power offered by phylogeneticsis exemplified by Figure 4b in which we cantrace in one framework the origin of prokary-otic genes in eukaryotic nuclear genomes viaprimary endosymbiosis (filled blue circle) andthe subsequent transfer of these genes fromone or more red algae to the chromalveolatesvia secondary endosymbiosis (filled red circle).This type of analysis has also provided directevidence for tertiary endosymbiosis in which

an alga containing a secondary plastid was it-self engulfed and retained by another protist(13, 40, 69). Although not discussed in detailhere, this phenomenon is until now limitedto dinoflagellates that are the masters of serialendosymbiosis (31).

Case Study: The Peculiar Path ofDinoflagellate Peridinin PlastidEvolutionThe most common type of plastid in di-noflagellates contains peridinin as the majorcarotenoid. This pigment, although similarin structure to fucoxanthin, is unique tothis group. Three membranes surround theperidinin-containing plastid, which is not

156 Reyes-Prieto ·Weber · Bhattacharya

An

nu

. R

ev.

Gen

et.

20

07

.41

. D

ow

nlo

aded

fro

m a

rjo

urn

als.

ann

ual

rev

iew

s.o

rgb

y U

NIV

ER

SIT

E D

E L

IEG

E o

n 0

7/0

3/0

7.

Fo

r p

erso

nal

use

on

ly.

Reyes-Prieto et al. (2007) Annu Rev Genet 41:147-168

Page 32: Denis BAURAIN

a distantly related photosynthetic eukaryote whose plastid

evolved directly from the cyanobacterial plastid progenitor.

Inferring how many times the ‘primary’ plastids of red algae,

green algae (and plants) and glaucophyte algae evolved into

‘secondary’ plastids is an area of active investigation and

debate.(22–25) No secondary plastids derived from glauco-

phytes are known, but both green and red algae have, each at

the very least on one occasion, been captured and converted

into a secondary plastid (Fig. 2). This process involves a

second round of EGT, this time from the endosymbiont

nucleus to that of the secondary host (Fig. 1), as well as the

evolution of another protein import pathway built on top of that

used by primary plastids.(5,26) For these reasons, successful

integration of a secondary endosymbiont is thought by many

to be difficult to achieve, and secondary endosymbiosis is

thus usually invoked only sparingly.

Secondary plastids of green algal origin occur in

euglenophytes and chlorarachniophytes, whereas most

plastids in so-called ‘chromalveolates’ are derived from red

algae. Chromalveolates include cryptophytes, haptophytes,

dinoflagellates, apicomplexans, the newly discovered coral

alga Chromera velia and stramenopiles (or heterokonts), the

latter being the group to which diatoms belong (Fig. 2). The

chromalveolate hypothesis was put forth as an attempt to

N2

N1

HGT

M2

M1

CB

HGT HGT

Secondary host

Plastidprogenitor

Primary host

EGTEGT

Figure 1. Endosymbiosis and gene flow in photosynthetic eukar-yotes. Diagram depicts movement of genes in the context of primaryand secondary endosymbiosis, beginning with the cyanobacterialendosymbiont (CB) that gave rise to modern-day plastids. Acquisitionof genes by horizontal (or lateral) gene transfer is always a possibility,and such genes can be difficult to distinguish from those acquired byendosymbiotic gene transfer (EGT). CB, cyanobacterium; HGT, hor-izontal gene transfer; MT, mitochondrion.

Archaeplastida

Apicomplexans +/-Chromera

Ciliates

Rhizaria

ForaminiferaOther Rhizaria

Other heterotrophs

Stramenopiles

Alveolates

GoniomonasKatablepharidsTelonemids

'Chromalveolates'

Perkinsids (?)

SAR

Centrohelids

Oomycetes

DiatomsOther phototrophs

Haptophytes

Cryptophytes

(Pico)biliphytes (?)

Chlorarachniophytes

Hacrobia

2o

Amoebozoa

Lobose amoebaeSlime mouldsIncludes

Excavata

Kinetoplastids

Heterolobosea

Euglenophytes

ParabasalidsDiplomonads

Jakobids

2o

Includes

OpisthokontsAnimals

Fungi

IncludesChoanoflagellates

3o

Green algae + plants

Red algaeGlaucophytes

1o

2o

HGT

HGT HGT

HGT

HGT

2oApusomonads,Other lineages

3o

Dinoflagellates +/- 3o

Figure 2. Origin and spread of photosynthesis across the eukaryotic tree of life. Diagram shows hypothesised ‘supergroups’ with emphasis onthose containing photosynthetic lineages. Some (but not all) lineages within the different supergroups are provided for context. The treetopology shown within the ‘chromalveolate’!SAR (Stramenopiles!Alveolates!Rhizaria) clade represents a synthesis of phylogenetic andphylogenomic data published as of 31 August 2009. Branch lengths do not correspond to evolutionary distance. Dashed lines indicateuncertainties with respect to the timing and/or directionality of secondary (28) or tertiary (38) endosymbiotic events, question marks (?) indicateuncertainty as to the presence of a plastid and ‘!/"’ indicates that both plastid-bearing (!) and plastid-lacking (") dinoflagellates andapicomplexans exist. Examples of green and red algal-derived tertiary plastids in dinoflagellates are known (see text for discussion). HGT,horizontal (or lateral) gene transfer.

What the papers say M. Elias and J. M. Archibald

1274 BioEssays 31:1273–1279, ! 2009 Wiley Periodicals, Inc.

Elias and Archibald (2009) BioEssays 31:1273-1279

Page 33: Denis BAURAIN

4D’où viennent les Eucaryotes ?

Page 34: Denis BAURAIN

Eukaryotic Origins

http://reflexions.ulg.ac.be/

3 main scenarios

Page 35: Denis BAURAIN

Chimeric Eukaryotes (1) c cyanobactérie (eubact érie) p protéobactérie (eubact érie) m méthanogène (archébactérie) y levure (Eucaryote)

gènes informationnelsgènes opérationnels

628 gènes 513 gènes

628 gènes

Rivera et al. (1998) PNAS USA 95:6239-6244; Poptsova & Gogarten (2007) BMC Bioinformatics 8:120

BLAST BRH(best reciprocal hits)

James Lake

Page 36: Denis BAURAIN

© 2006 Nature Publishing Group

called into question, as some phylogenies have suggested thatprokaryotes might be derived from eukaryotes55. However, theubiquity of mitochondrial homologues represents a strong argumentthat clearly polarizes the prokaryote-to-eukaryote transition: becausethe common ancestor of contemporary eukaryotes contained amitochondrial endosymbiont that originated from within theproteobacterial lineage, we can confidently infer that prokaryotesarose and diversified before contemporary eukaryotes—the onlyones whose origin requires explanation—did. This view is consistentwith microfossil and biogeochemical evidence56.Current ideas on the origin of eukaryotes fall into two general

classes: those that derive a nucleus-bearing but amitochondriate cellfirst, followed by the origin of mitochondria in a eukaryotic host57–61

(Fig. 4a–d), and those that derive the origin of mitochondria in aprokaryotic host, followed by the origin of eukaryotic-specificfeatures62–64 (Fig. 4e–g). Models that derive a nucleated but amito-chondriate cell as an intermediate (Fig. 4a–d) have suffered asubstantial blow with the demise of Archezoa. Models that do notentail amitochondriate intermediates have in common that the hostassumed to have acquired the mitochondrion was an archaebacter-ium not a eukaryote; hence, the steep organizational grade betweenprokaryotes and eukaryotes follows in the wake of radical chimaer-

ism involvingmitochondrial origins (Fig. 4e–g). A criticism facing all‘archaebacterial host’ models is that phagotrophy (the ability toengulf bacteria as food particles) was once seen as an absoluteprerequisite for mitochondrial origins60. This argument has lostsome of its strength with the discovery of symbioses where oneprokaryote lives inside another, non-phagocytotic prokaryote65.

The elusive informational ancestorWith the exception of the neomuran hypothesis, which views botheukaryotes and archaebacteria as descendants of Gram-positiveeubacteria60,61 (Fig. 4d), most current theories for eukaryotic origins(Fig. 4) posit the involvement of an archaebacterium in that process.The archaebacterial link to eukaryote origins was first inferredfrom shared immunological and biochemical similarities of theirDNA-dependent RNA polymerases66. Tree-based studies of entiregenomes67,68 extended this observation: most eukaryotic genes forreplication, transcription and translation (informational genes) arerelated to archaebacterial homologues, while those encoding biosyn-thetic and metabolism functions (operational genes) are usuallyrelated to eubacterial homologues8,67,68.The rooted SSU rRNA tree1 depicts eukaryotes and archaebacteria

as sister groups, as in the neomuran (Fig. 4d) hypothesis60,61. By

Figure 4 |Models for eukaryote origins that are, in principle, testable withgenome data. a–d, Models that propose the origin of a nucleus-bearingbut amitochondriate cell first, followed by the acquisition of mitochondriain a eukaryotic host. e–g, Models that propose the origin of mitochondria ina prokaryotic host, followed by the acquisition of eukaryotic-specific

features. Panels a–g are redrawn from refs 57 (a), 58 (b), 59 (c), 60 and 61(d), 62 (e), 63 (f) and 64 (g). The relevant microbial players in each modelare labelled. Archaebacterial and eubacterial lipid membranes are indicatedin red and blue, respectively.

NATURE|Vol 440|30 March 2006 REVIEWS

627

Embley & Martin (2006) Nature 440:623-630

Chimeric Eukaryotes (2)

‘Archezoa’

‘archébactérie’à mitochondrie

nombreux scénarios de fusionentre une archébactérie et

une ou plusieurs eubactérie(s)

Martin Embley

Page 37: Denis BAURAIN

Eukaryotic Signature Proteins

mitochondria descended from the ancestralbacterium (17, 18, 36, 47). Several hundredgenes have been transferred from the ancestralbacterium to the nuclear genome, but mostproteins from the original endosymbiont havebeen lost. For yeast, the largest protein classcontains more than 200 eukaryote proteins(ESPs) targeted to the mitochondrion but en-coded in the nucleus. In addition, the yeastnucleus encodes 150 mitochondrial proteins notuniquely identifiable with a single domain butapparently eukaryotic descendants from the com-mon ancestor. Accordingly, the yeast and humanmitochondria proteomes emerge largely asproducts of the eukaryotic nuclear genome(85%) and only to a lesser degree (15%) as directdescendants of endosymbionts (17, 18, 36, 45).The strong representationof ESPs in their prote-omesmeans thatmitochon-dria and their descendantsareusefullyviewedas‘‘hon-orary’’ CSSs.

There are substantialnumbers of ESPs in theother CSSs. For the pro-teome of the reduced an-aerobic parasite Giardialamblia (23), searches of2136 proteins found ineach of Saccharomycescerevisiae, Drosophilamelanogaster,Caenorhab-ditis elegans, and Arabi-dopsis thaliana yielded347 ESPs for G. lamblia.Thiswas reduced to rough-ly 300 by rigorous screen-ing, with ESPs distributedbetween nuclear and cy-toplasmic compartments(Fig. 2) (48). The ubiquityof the ESPs and the ab-sence of archaeal de-scendants are not easilyexplained by a prokary-ote genome fusion model(49). The simplest inter-pretation is that the host for the endosymbiont/mitochondrial lineage was an ancestral eukaryote.

Similar results are obtained for anotherreduced eukaryote, the intracellular parasiteEncephalitozoon cuniculi. A recent study (24)identified 401 ESPs, of which 295 had homo-logs among the ESPs of G. lamblia (23). Twomajor categories of ESPs in the G. lamblia andE. cuniculi genomes were distinguished: thoseassociated with the CSSs (Fig. 2) and thoseinvolved in control functions such as guanosinetriphosphate (GTP) binding proteins, kinases,and phosphatases (7). It was also observed (23)that many characteristic eukaryotic proteins withweak sequence homology to prokaryotic proteinsbut more convincing homologies of structuralfold such as the actins, tubulins, kinesins,

ubiquitins, and some GTP binding proteins areamong the most highly conserved eukaryoticproteins. These may be descendants of the com-mon ancestor recruited early in the evolution ofthe eukaryotic nuclear genome.

Nucleolar proteomes (20, 21) are examplesof essential eukaryote compartments not wrappedin double membranes and where there is nosuspicion of an endosymbiotic origin. From 271proteins in the human nucleolar proteome, 206protein folds were identified and classified phy-logenetically (20, 21). Of these, 109 are eukary-otic signature folds, and the remaining onesappear to be descendants of the common an-cestor, occurring in two or three domains.

The spliceosome is a unique molecularmachine that removes introns from eukaryote

mRNAs (22). Even though we do not know theancestral processing signals for the earliesteukaryotes (50), roughly half of the 78 spliceo-somal proteins likely to be present in the an-cestral spliceosome are ESPs, (13) whereasthe other half containing the Sm/LSm proteins(51) have homologs in bacteria and archaea(13). These distributions of both ESPs as wellas of putative descendants of the common an-cestor suggest that many components of mod-ern spliceosomes were present in the commonancestor (52).

The subdivision into subcellular compart-ments (CSSs) with characteristic proteomes re-stricts proteins to volumes considerably smallerthan the whole cell. Concentrations of macro-molecules in cells are very high, typically be-

tween 20 and 30% of weight or volume (53).Such densities are described as ‘‘molecularcrowding’’ because the space between macro-molecules is much less than their diameters;consequently, diffusion of proteins in cells isretarded (54). Molecular crowding favors mac-romolecular associations, large complexes, andnetworks of proteins that support biologicalfunctions (16, 17, 53).

High densities enhance the association ki-netics of small molecules with proteins becausethe excluded volumes of the proteins reduce theeffective volume through which small moleculesdiffuse (55). The sum of these effects is that thehigh macromolecular densities within CSSs en-hance the kinetic efficiencies of proteins. Thesame principles apply to the smaller prokary-

otic cells, but the effectsare accentuated in largercells. Subdividing highdensities of proteins in-to more or less distinctcompartments contain-ing functionally interac-tive macromolecules isexpected to be an earlyfeature of the eukary-ote lineage. The distinc-tive proteome of nucleolidemonstrates that com-partmentation does notrequire an enclosing mem-brane. Furthermore, cellfusion is not required toaccount for, nor does itexplain (49), the largenumber of eukaryote cellcompartments.

Selection Gives andSelection Takes

Genomes evolve continu-ously through the interplayof unceasing mutation,unremitting competition,and ever-changing envi-ronments. Both sequenceloss and sequence gain

can result. In general, expanded genome size,along with augmented gene expression, increasesthe costs of cell propagation so the evolution oflarger genomes and larger cells requires gains infitness that compensate (15, 56, 57). Conversely,genome reduction is expected to lower the costsof propagation. There is an ever-present poten-tial to improve the efficiency of cell propaga-tion by reductive evolution.

Environmental shifts may neutralize se-quences, leaving no selective pressure to main-tain them against the persistent flux of deleteriousmutations. Such neutralized sequences eventu-ally and inevitably disappear because of ‘‘mu-tational meltdown’’ (14, 15, 56, 57). Genomereduction can be achieved through differentialloss of coding and noncoding sequences (com-

Fig. 2. Distribution of ESPs in the proteome of G. lamblia. ESPs (23) were matched to the humanInternational Protein Index data set (48) and then assigned to individual CSSs based on their geneontology annotations. A protein may be present in more than one CSS (e.g., a protein involved intransport from the nucleus to the cytoplasm will be assigned to both CSSs). Black numbers are thenumber of proteins assigned to each CSS from the total G. lamblia proteome (AACB00000000)(3077 ORFs matched and linked to gene ontology); red numbers are the ESPs assigned to each CSS(320 proteins matched and linked to gene ontology).

REVIEW

19 MAY 2006 VOL 312 SCIENCE www.sciencemag.org1012

on M

arc

h 1

2,

20

07

w

ww

.scie

ncem

ag.o

rgD

ow

nlo

ade

d fro

m

Kurland et al. (2006) Science 312:1011-1014

347 ESPs / 2136

Giardia lamblia

1. inventées chez les Eucaryotes2. très modifiées chez les Eucaryotes3. perdues chez les Procaryotes

Charles Kurland

Page 38: Denis BAURAIN

Life Roots

Delsuc et al. (2005) Nat Rev Genet 6:361-375; Forterre & Philippe (1999) BioEssays 21:871-879 Rivera & Lake (2004) Nature 431:152-155

Patrick Forterre Hervé Philippe

hypotheses, the derived domain corresponds to the eukary-

otes that originated from a merging of two primitive prokary-

otic lineages, but the authors disagree on the nature of these

lineages (see legend of Fig. 1).

Alternatively, others have favoured the view that the rRNA

tree rooted in the bacterial branch is the good one, and that

most, if not all, contradictions observed in other phylogenies

should be explained by lateral gene transfers (LGT).(8,21) For

these authors, the choice of the ‘‘correct tree’’ is based on the

assumption that proteins involved in translation, transcription,

and replication (informational proteins, sensu, Rivera et

al.,(14)) which are often more similar in Archaea and Eucarya,

are especially informative in inferring phylogenies because

they are less frequently involved in LGT. Moreover, Brown

and Doolittle(8) argued that these proteins have evolved at a

similar rate in the three domains, through the application of

the relative rate test to a large data set(8) (see Fig. 2A for

explanation).

The common assumption of most people who support one

of the two above hypotheses (fusion or LGT) is that trees

inferred from molecular phylogenetic analyses are basically

correct (and thus require biological explanations). We argue

here that this is not a correct assumption and that many

molecular phylogenies, in particular those rooting the tree of

life, do not reflect the genuine evolutionary history (see for

review (22)). Recent studies in eukaryotic evolution support

this view and suggest an explanation for some artefactual

results that are strongly supported by classical tree reconstruc-

tion methods.(23–26) In addition, we argue that most authors

focus too exclusively on LGT or cell fusion as a possible

source of mosaicism between the three domains (often

ignoring that mosaicism is a general rule in evolution) and do

not pay enough attention to other major evolutionary forces,

such as differential gene loss and non-orthologous replace-

ment. Finally we think that most current scenarios about early

evolution are highly biased by prejudicial view of prokaryotes

as primitive organisms and we consider alternative hypoth-

Figure 1. Current views of the universal tree. a: The

traditional tree with the root in the bacterial branch (in green)

and the sisterhood of archaea and eucarya.(1) This tree mainly

corresponds to the rRNA tree, the rooting being based on

duplicated genes. b: Two universal trees of paralogous

proteins are used to root each other.(3,4) The root of the tree of

life cannot be inferred from a single gene, because molecular

phylogeny only produced unrooted trees. As proposed by

Schwartz and Dayhoff,(78) when a gene duplication has oc-

curred in an earliest common ancestor and has been followed

by divergent evolution in eukaryotes and prokaryotes, it is

possible to reciprocally root trees of paralogous genes.

c: Recent hypotheses suggest instead the emergence of

eukaryotes from the merging of ancient bacteria and archaea.

In some scenarios, the archaeal endosymbiont that gives rise

to the eukaryotic nucleus was a crenoarchaeota,(63) in other

cases a methanogenic euryarchaeota.(19,20) Gupta(9) suggests

a first split between Gram positive and Gram negative bacteria

(instead of Bacteria and Archaea) followed by the emergence

of Archaea from Gram positive bacteria.

Figure 2. Effect of mutational saturation on the relative rate

test and the long branch attraction artefact. A: The relative

rate test is performed to know whether speciesA and B evolve

at the same rate. To do that, one tests if d(A,O) ! d(B,O).(79)

This test is efficient when the gene under study is not

mutationally saturated (! and"). In contrast, two species that

have evolved at different rates (# and $, upper right tree) will

be also separated from an outgroup by the same distance if

the sequences are saturated by multiple substitutions, be-

cause in that case, the observed differences between these

two proteins are not proportional to the real number of

substitutions (plateau of the curve). When performed on such

a protein, the relative rate test will give the misleading result

that the two species have evolved at the same rate. Obviously,

the relative rate test is improved by the use of methods that

correct hidden multiple substitutions, but these methods are

hindered by the poor fit of the model of sequence evolution

(see(22) for a more complete discussion). B,C: Left trees aresupposed to be real ones and right trees are those incorrectly

recovered by classical methods of molecular phylogeny be-

cause of the long branch attraction artefact.

Problems and paradigms

872 BioEssays 21.10

hypotheses, the derived domain corresponds to the eukary-

otes that originated from a merging of two primitive prokary-

otic lineages, but the authors disagree on the nature of these

lineages (see legend of Fig. 1).

Alternatively, others have favoured the view that the rRNA

tree rooted in the bacterial branch is the good one, and that

most, if not all, contradictions observed in other phylogenies

should be explained by lateral gene transfers (LGT).(8,21) For

these authors, the choice of the ‘‘correct tree’’ is based on the

assumption that proteins involved in translation, transcription,

and replication (informational proteins, sensu, Rivera et

al.,(14)) which are often more similar in Archaea and Eucarya,

are especially informative in inferring phylogenies because

they are less frequently involved in LGT. Moreover, Brown

and Doolittle(8) argued that these proteins have evolved at a

similar rate in the three domains, through the application of

the relative rate test to a large data set(8) (see Fig. 2A for

explanation).

The common assumption of most people who support one

of the two above hypotheses (fusion or LGT) is that trees

inferred from molecular phylogenetic analyses are basically

correct (and thus require biological explanations). We argue

here that this is not a correct assumption and that many

molecular phylogenies, in particular those rooting the tree of

life, do not reflect the genuine evolutionary history (see for

review (22)). Recent studies in eukaryotic evolution support

this view and suggest an explanation for some artefactual

results that are strongly supported by classical tree reconstruc-

tion methods.(23–26) In addition, we argue that most authors

focus too exclusively on LGT or cell fusion as a possible

source of mosaicism between the three domains (often

ignoring that mosaicism is a general rule in evolution) and do

not pay enough attention to other major evolutionary forces,

such as differential gene loss and non-orthologous replace-

ment. Finally we think that most current scenarios about early

evolution are highly biased by prejudicial view of prokaryotes

as primitive organisms and we consider alternative hypoth-

Figure 1. Current views of the universal tree. a: The

traditional tree with the root in the bacterial branch (in green)

and the sisterhood of archaea and eucarya.(1) This tree mainly

corresponds to the rRNA tree, the rooting being based on

duplicated genes. b: Two universal trees of paralogous

proteins are used to root each other.(3,4) The root of the tree of

life cannot be inferred from a single gene, because molecular

phylogeny only produced unrooted trees. As proposed by

Schwartz and Dayhoff,(78) when a gene duplication has oc-

curred in an earliest common ancestor and has been followed

by divergent evolution in eukaryotes and prokaryotes, it is

possible to reciprocally root trees of paralogous genes.

c: Recent hypotheses suggest instead the emergence of

eukaryotes from the merging of ancient bacteria and archaea.

In some scenarios, the archaeal endosymbiont that gives rise

to the eukaryotic nucleus was a crenoarchaeota,(63) in other

cases a methanogenic euryarchaeota.(19,20) Gupta(9) suggests

a first split between Gram positive and Gram negative bacteria

(instead of Bacteria and Archaea) followed by the emergence

of Archaea from Gram positive bacteria.

Figure 2. Effect of mutational saturation on the relative rate

test and the long branch attraction artefact. A: The relative

rate test is performed to know whether speciesA and B evolve

at the same rate. To do that, one tests if d(A,O) ! d(B,O).(79)

This test is efficient when the gene under study is not

mutationally saturated (! and"). In contrast, two species that

have evolved at different rates (# and $, upper right tree) will

be also separated from an outgroup by the same distance if

the sequences are saturated by multiple substitutions, be-

cause in that case, the observed differences between these

two proteins are not proportional to the real number of

substitutions (plateau of the curve). When performed on such

a protein, the relative rate test will give the misleading result

that the two species have evolved at the same rate. Obviously,

the relative rate test is improved by the use of methods that

correct hidden multiple substitutions, but these methods are

hindered by the poor fit of the model of sequence evolution

(see(22) for a more complete discussion). B,C: Left trees aresupposed to be real ones and right trees are those incorrectly

recovered by classical methods of molecular phylogeny be-

cause of the long branch attraction artefact.

Problems and paradigms

872 BioEssays 21.10

© 2005 Nature Publishing Group

3 6 6 | MAY 2005 | VOLUME 6 www.nature.com/reviews/genetics

R E V I EW S

principal groups have been identified among the ordersof placental mammals, and their origins might beexplained by geographical isolation, resulting from platetectonics, during the early stages of diversificationamong placental mammals85.Molecular studies havealso revealed the prevalence of morphological conver-gence during the evolution of placental mammals,with the occurrence of parallel adaptive radiations14.This might partly explain why reconstructing the phy-logeny of placental orders on the basis of morphologicalcharacters has been difficult.

Similarly, studies of multiple genes from all threecompartments of the plant cell (the mitochondrion,chloroplast and nucleus) have helped to overcome long-standing uncertainties in land-plant phylogeny11,88–92.These advances have brought several changes to the traditional botanical classification of flowering plants,which were previously based on morphological featuressuch as floral characteristics and leaf shape93. In thiscase, as with the placental mammals, molecular evi-dence revealed the plasticity of phenotypic characters,highlighting several examples of previously unrecog-nized convergent evolution. This is perhaps best illus-trated by the case of the sacred lotus (Nelumbo nucifera),which was previously thought to be related to waterlilies (Nymphaeaceae), whereas molecular studiesunveiled a close phylogenetic affinity with plane trees(genus Platanus)88.

At a larger scale, sequence-based phylogenomic stud-ies of eukaryotic phylogeny confirmed the MONOPHYLY ofmost phyla, which were originally defined on the basisof ultrastructural or rRNA analyses.However, they alsodemonstrated the common origin of a group of mor-phologically diverse amoebae, which were previouslythought to have evolved independently17.Phylogenomicanalyses of nuclear20,94 and mitochondrial genes95 havealso corroborated the long-suspected sister-group rela-tionship between the unicellular choanoflagellates andmulticellular animals.However, the lack of representa-tives from several principal lineages (that is, Rhizaria,Cryptophytes, Haptophytes and Jakobids) in phyloge-nomic studies currently prevents the study of most of theworking hypotheses derived from decades of ultrastuc-tural and molecular studies. These hypotheses have pro-posed the division of the eukaryotic world into six maingroups80,96: the Opisthokonta, Amoebozoa, Plantae,Chromalveolata, Rhizaria, and Excavata (FIG. 1). Thevalidity of these proposed ‘kingdoms’ represents oneof the most important outstanding questions thatphylogenomics has the potential to answer.

Finally, the question of the origin of the eukaryoteshas been recently addressed97 using a gene-contentmethod that allows the modelling of genome-fusionevents60. The authors proposed that the eukaryotes orig-inated from a fusion event between a bacterial speciesand an archaeal species, leading to a ring-like structureat the root of the tree of life97. This would account forthe chimeric nature of the eukaryotes, which has beeninferred from the observation that many eukaryoticmetabolic genes have a greater number of similarcounterparts in Bacteria, whereas most of those

phylogenomic approach to shed light on long-standingphylogenetic questions, spanning all levels of the tree oflife (FIG. 1). In this section, we present recent advances foreach of the three domains of life that have been enabledby phylogenomics.

Eukaryotes.The reconstructions of phylogenies of pla-cental mammals and land plants represent the mostspectacular examples of recent advances enabled byphylogenomics. The evolutionary history of placentalmammals was traditionally considered to be irresolv-able, owing to the explosive radiation of species thatoccurred in a short space of time.However, this has nowbeen elucidated by the analysis of supermatrices thatcontain about 20 nuclear genes14,15,85,86, and the resultingphylogeny has been confirmed by recent analyses ofcomplete mitochondrial genomes30,87, with only a fewnodes left unresolved. All but 1 of the 18 morphologi-cally defined extant mammalian orders have been con-firmed by molecular studies. The notable exception tothis is the insectivores, which have been split into twodistinct orders by the recognition of an unexpectedgroup of African origin named Afrotheria14,15,77. Four

MONOPHYLY

Monophyletic taxa include allthe species that are derived froma single common ancestor.

Proteobacteria

Crenarchaeota

DesulfurococcalesThermo-proteales

Low-GC, Gram positive

Cyanobacteria

High-GC,Gram positive

Thermotogales

Aquificales

Chlamydiales

Plancto-mycetales

Deinococcales

Spirochaetes

Metazoans

Land plants

Green algaeRed algae

Glaucophytes

MycetozoansPelobionts

Entamoebae

RadiolariaCercozoa

Alveolates

StramenopilesHaptophytes

Crypto-phytes

Euglenoids

DiplomonadsJakobids

Amoebozoa

ExcavataChromalveolata

Rhizaria

Plantae

Euryarchaeota

Methanococcales

Methanosarcinales

Halobacteriales

Archaeoglobales

Thermoplasmatales

Sulfolobales

ThermococcalesMitochondria

ChloroplastsEukaryota

ArchaeaBacteria

Choanoflagellates

Fungi

Opisthokonta

Figure 1 | Phylogenomics and the tree of life. A schematic representation showing recent

advances and future challenges of the phylogenomic approach for resolving the main branches

of the tree of life. This tree aims to represent a consensus view on evolutionary relationships within

the three domains — Bacteria, Archaea, and Eukaryota — with hypothetical relationships

indicated as dashed lines. The main branches that have been identified (purple) or confirmed

(yellow) by phylogenomics are indicated. Blue dashed lines underline putative phylogenetic

hypotheses that have been indicated by phylogenomic studies and need further investigation.

The main uncertainties for which the phylogenomic approach might provide future answers are

pinpointed by red circles. Note that most of the progress brought about by the phylogenomic

approach has been realized at a smaller taxonomic scale for land plants, and for placental

mammals within the metazoans (see main text). The two well-recognized endosymbiotic events

involving bacteria that gave rise to eukaryotic organelles (mitochondria and chloroplasts) are

indicated by arrows (blue and green, respectively). Note however, that other horizontal gene

transfers and gene duplication events are not represented in this organismal tree, although they

do constitute important aspects of genome evolution.

Tree of Life

hypotheses, the derived domain corresponds to the eukary-

otes that originated from a merging of two primitive prokary-

otic lineages, but the authors disagree on the nature of these

lineages (see legend of Fig. 1).

Alternatively, others have favoured the view that the rRNA

tree rooted in the bacterial branch is the good one, and that

most, if not all, contradictions observed in other phylogenies

should be explained by lateral gene transfers (LGT).(8,21) For

these authors, the choice of the ‘‘correct tree’’ is based on the

assumption that proteins involved in translation, transcription,

and replication (informational proteins, sensu, Rivera et

al.,(14)) which are often more similar in Archaea and Eucarya,

are especially informative in inferring phylogenies because

they are less frequently involved in LGT. Moreover, Brown

and Doolittle(8) argued that these proteins have evolved at a

similar rate in the three domains, through the application of

the relative rate test to a large data set(8) (see Fig. 2A for

explanation).

The common assumption of most people who support one

of the two above hypotheses (fusion or LGT) is that trees

inferred from molecular phylogenetic analyses are basically

correct (and thus require biological explanations). We argue

here that this is not a correct assumption and that many

molecular phylogenies, in particular those rooting the tree of

life, do not reflect the genuine evolutionary history (see for

review (22)). Recent studies in eukaryotic evolution support

this view and suggest an explanation for some artefactual

results that are strongly supported by classical tree reconstruc-

tion methods.(23–26) In addition, we argue that most authors

focus too exclusively on LGT or cell fusion as a possible

source of mosaicism between the three domains (often

ignoring that mosaicism is a general rule in evolution) and do

not pay enough attention to other major evolutionary forces,

such as differential gene loss and non-orthologous replace-

ment. Finally we think that most current scenarios about early

evolution are highly biased by prejudicial view of prokaryotes

as primitive organisms and we consider alternative hypoth-

Figure 1. Current views of the universal tree. a: The

traditional tree with the root in the bacterial branch (in green)

and the sisterhood of archaea and eucarya.(1) This tree mainly

corresponds to the rRNA tree, the rooting being based on

duplicated genes. b: Two universal trees of paralogous

proteins are used to root each other.(3,4) The root of the tree of

life cannot be inferred from a single gene, because molecular

phylogeny only produced unrooted trees. As proposed by

Schwartz and Dayhoff,(78) when a gene duplication has oc-

curred in an earliest common ancestor and has been followed

by divergent evolution in eukaryotes and prokaryotes, it is

possible to reciprocally root trees of paralogous genes.

c: Recent hypotheses suggest instead the emergence of

eukaryotes from the merging of ancient bacteria and archaea.

In some scenarios, the archaeal endosymbiont that gives rise

to the eukaryotic nucleus was a crenoarchaeota,(63) in other

cases a methanogenic euryarchaeota.(19,20) Gupta(9) suggests

a first split between Gram positive and Gram negative bacteria

(instead of Bacteria and Archaea) followed by the emergence

of Archaea from Gram positive bacteria.

Figure 2. Effect of mutational saturation on the relative rate

test and the long branch attraction artefact. A: The relative

rate test is performed to know whether speciesA and B evolve

at the same rate. To do that, one tests if d(A,O) ! d(B,O).(79)

This test is efficient when the gene under study is not

mutationally saturated (! and"). In contrast, two species that

have evolved at different rates (# and $, upper right tree) will

be also separated from an outgroup by the same distance if

the sequences are saturated by multiple substitutions, be-

cause in that case, the observed differences between these

two proteins are not proportional to the real number of

substitutions (plateau of the curve). When performed on such

a protein, the relative rate test will give the misleading result

that the two species have evolved at the same rate. Obviously,

the relative rate test is improved by the use of methods that

correct hidden multiple substitutions, but these methods are

hindered by the poor fit of the model of sequence evolution

(see(22) for a more complete discussion). B,C: Left trees aresupposed to be real ones and right trees are those incorrectly

recovered by classical methods of molecular phylogeny be-

cause of the long branch attraction artefact.

Problems and paradigms

872 BioEssays 21.10

racine eubactérienne artéfactuelle ?

protéines dupliquéesSSU rRNA

cumulative probabilities of these five trees are shown at the right ofeach tree. The upper tree is supported by 60.5% of the conditionedreconstruction bootstraps, the second tree by 16.8% of the con-ditioned reconstruction bootstraps, for a cumulative probability of77.3%, and so on. It initially appears that the resolution of the tree ispoor because themost probable tree is supported by a low bootstrapvalue, and the other trees are supported by even lower values.However, when the fivemost probable unrooted trees are aligned byshifting each to the left or the right, until their leaves match, theyform a repeating pattern indicating that the five trees are simplypermutations of an underlying cyclic pattern. This implies that theyare derived from the single cycle graph (ring) shown at the bottomleft of Fig. 1 (ref. 20). When that ring is cut at any one of the fivecentral arcs and then unfolded, the resulting unrooted tree willcorrespond to one of the five most probable trees. In other words,the data are not tree-like they are ring-like.

A rigorous combinatorial analysis of the genomic fusion of twoorganisms valid for all possible initial pre-fusion trees has shownthat the conditioned reconstruction algorithm recovers all permu-tations of the cycle graph, and only those permutations20. Hence,these results may be interpreted, in a manner analogous to the

interpretation of restriction digests of a circular plasmid or themapping of a circular chromosome, as implying a ring of life. Thering shown at the lower left of Fig. 1 is completely consistent with allfive of the fully resolved trees shown at the top of the figure, hencewe refer to it as a fully resolved ring. That ring explains 96.3% of thebootstrap replicates, and the partially resolved ring at the lower rightof Fig. 1 explains 99.2% of the bootstrap replicates. The ring issupported for all possible conditioning genomes (see Supplemen-tary Data). This provides strong evidence for the completelyresolved ring on the left, and even stronger evidence for the less-resolved ring on the right. From these results we infer that theeukaryotic nuclear genome was formed from the fusion of thegenomes of a relative of a proteobacterium (Pg) and a relative of anarchaeal eocyte (E).

Eubacterial roots of the ring of lifeBy analysing additional taxa, one can discover whether the eubacter-ial fusion partner branches even deeper than the g-proteobacteriumXylella. The data set for this second analysis, described in the legendfor Fig. 2, specifically included the a-proteobacterium Brucella andthe cyanobacterium Synechocystis. The results are shown in Fig. 2.The cumulative probability of the five trees, shown at the top of thefigure, corresponds to 87.1% of the total bootstraps, indicatingsignificant statistical support for the ring shown at the lower leftof Fig. 2. The percentage increases to 92.8% when the a-proteo-bacterium and the g-proteobacterium are allowed to form anunresolved bifurcation, and rises to 99.7% when the a-proteobac-terium, g-proteobacterium and cyanobacterium are allowed toform an unresolved trifurcation. The ring is also supported for allpossible conditioning genomes (see Supplementary Data). Thecombined analyses in Figs 1 and 2 provide strong evidence for thering in Fig. 3, and indicate that the immediate relative of theeubacterial fusion partner was probably a primitive proteobacter-ium, or possibly a primitive photosynthetic eubacterium.The second analysis, illustrated in Fig. 2, did not detect an

additional ring connecting the a-Proteobacteria, the phylum fromwhich mitochondria arose27, to the eukaryotes. There is increasingevidence that eukaryotes have received a deluge of DNA fromorganelles28. In one study, 630 nuclear-encoded genes in eukaryotes

Figure 2 Eubacterial relationships within the ring of life. The genomes are from two yeasts

(Y1, S. pombe and Y2, S. cerevisiae), a g-proteobacterium (Pg, X. fastidiosa 9a5c), an

a-proteobacterium (Pa, Brucella melitensis 16M), a cyanobacterium (C, Synechocystis

sp. PCC6803), a halobacterium (H, Halobacterium sp. NRC-1), an eocyte (E, S. tokodaii)

and a bacillus (not shown; the conditioning genome S. aureus MW2). The five unrooted

trees consistent with the ring are shown with leaves pointing upward to emphasize that

each is part of a repeating pattern. Cumulative probabilities are at the right of each tree.

Fully and partially resolved rings are at the lower left, right and centre, respectively.

Figure 3 A schematic diagram of the ring of life. The eukaryotes plus the two eukaryotic

root organisms (the operational and informational ancestors) comprise the eukaryotic

realm (see Supplementary Discussion). Ancestors defining major groups in the prokaryotic

realm are indicated by small circles on the ring. The Archaea49, shown on the bottom right,

includes the Euryarchaea, the Eocyta and the informational eukaryotic ancestor. The

Karyota5, shown on the upper right of the ring, includes the Eocyta and the informational

eukaryotic ancestor. The upper left circle includes the Proteobacteria49 and the

operational eukaryotic ancestor. The most basal node on the left represents the

photosynthetic prokaryotes and the operational eukaryotic ancestor.

articles

NATURE |VOL 431 | 9 SEPTEMBER 2004 | www.nature.com/nature 153© 2004 Nature Publishing Group

‘Ring of Life’

Page 39: Denis BAURAIN

Secondary Simplification

Genomics and the Irreducible Natureof Eukaryote CellsC. G. Kurland,1 L. J. Collins,2 D. Penny2*

Large-scale comparative genomics in harness with proteomics has substantiated fundamentalfeatures of eukaryote cellular evolution. The evolutionary trajectory of modern eukaryotes isdistinct from that of prokaryotes. Data from many sources give no direct evidence that eukaryotesevolved by genome fusion between archaea and bacteria. Comparative genomics shows that, undercertain ecological settings, sequence loss and cellular simplification are common modes ofevolution. Subcellular architecture of eukaryote cells is in part a physical-chemical consequence ofmolecular crowding; subcellular compartmentation with specialized proteomes is required for theefficient functioning of proteins.

Comparative genomics and proteomicshave strengthened the view that moderneukaryote and prokaryote cells have long

followed separate evolutionary trajectories. Be-cause their cells appear simpler, prokaryoteshave traditionally been considered ancestors ofeukaryotes (1–4). Nevertheless, comparativegenomics has confirmed a lesson from paleon-tology: Evolution does not proceed monoton-ically from the simpler to the more complex(5–9). Here, we review recent data from pro-teomics and genome sequences suggesting thateukaryotes are a unique primordial lineage.

Mitochondria, mitosomes, and hydrogeno-somes are a related family of organelles thatdistinguish eukaryotes from all prokaryotes(10). Recent analyses also suggest that earlyeukaryotes had many introns (11, 12), and RNAsand proteins found in modern spliceosomes(13). Indeed, it seems that life-history param-eters affect intron numbers (14, 15). In addition,Bmolecular crowding[ is now recognized as animportant physical-chemical factor contributingto the compartmentation of even the earliesteukaryote cells (16, 17).

Nuclei, nucleoli, Golgi apparatus, centrioles,and endoplasmic reticulum are examples ofcellular signature structures (CSSs) that dis-tinguish eukaryote cells from archaea and bacte-ria. Comparative genomics, aided by proteomicsof CSSs such as the mitochondria (18, 19),nucleoli (20, 21), and spliceosomes (13, 22),reveals hundreds of proteins with no orthologsevident in the genomes of prokaryotes; theseare the eukaryotic signature proteins (ESPs)(23, 24). The many ESPs within the subcel-lular structures of eukaryote cells providelandmarks to track the trajectory of eukary-ote genomes from their origins. In contrast,

hypotheses that attribute eukaryote origins togenome fusion between archaea and bacteria(25–30) are surprisingly uninformative aboutthe emergence of the cellular and genomic sig-natures of eukaryotes (CSSs and ESPs). Thefailure of genome fusion to directly explain anycharacteristic feature of the eukaryote cell is acritical starting point for studying eukaryoteorigins.

It is agreed that, whether using gene con-tent, protein-fold families, or RNA sequences(31–36), the unrooted tree of life divides intoarchaea, bacteria, and eukaryotes (Fig. 1). Onsuch unrooted trees, the three domains divergefrom a population that can be called the lastuniversal common ancestor (LUCA). How-ever, LUCA (37) means different things todifferent people, so we prefer to call it a com-mon ancestor; in this case it is the hypothetical

node at which the three domains coalesce inunrooted trees.

There are links between comparative ge-nomics and the ecology of organisms. Theseinclude the aerobic/anaerobic states of theenvironment and the adaptive fit of organellessuch as mitochondria, hydrogenosomes, andmitosomes (10, 18, 19, 38–41). In addition tothe advantages from oxidative metabolism and/or oxygen detoxification, other advantages musthave accrued from having a cellular compart-ment with dense proteomes (15, 38, 42). Eco-logical specialization can account for thedifferences between prokaryote and eukaryotecell architectures and genome sizes. Small pro-karyote cells with streamlined genomes mayreflect adaptation to rapid growth and/or mini-mal resource use by autotrophs, heterotrophs, andsaprotrophs. Divergent evolutionary paths mayemerge with the adoption of a phagotrophic-feeding mode in an ancestor of eukaryotes. Thisuniquely eukaryote feeding mode requires alarger and more complex cell, consistent withearlier suggestions that a unicellular raptor(predator), which acquired a bacterial endo-symbiont/mitochondria lineage, became thecommon ancestor of all modern eukaryotes(3, 4, 43). Indeed, predator/prey relationshipsmay provide the ecological setting for thedivergence of the distinctive cell types adoptedby eukaryotes, bacteria, and archaea.

Proteomics of Cell Compartments

Comparative genomics and proteomics revealphylogenetic relationships between proteinsmaking up eukaryote subcellular features andthose found in prokaryotes. We distinguish threemain phylogenetic classes; the first are proteinsthat are unique to eukaryotes: the ESPs. TheESPs we place in three subclasses: proteinsarising de novo in eukaryotes; proteins sodivergent to homologs of other domains thattheir relationship is largely lost; or finally,descendants of proteins that are lost from otherdomains, surviving only as ESPs in eukaryotes.

The second class contains interdomainhorizontal gene transfers; these are proteinsoccurring in two domains with the lineage ofone domain rooted within their homologs in asecond domain (44). The third class containshomologs found in at least two domains, butthe proteins of one domain are not rootedwithin another domain(s); instead, the homo-logs appear to descend from the common an-cestor (Fig. 1). Most eukaryote proteins sharedby prokaryotes are distant, rather than close,relatives. Thus, proteins shared between do-mains appear to be descendants of the commonancestor; few seem to result from interdomainlateral gene transfer (31–35).

Although the genomes of mitochondria areclearly descendants of a-proteobacteria (45, 46),proteomics and comparative genomics identifyrelatively few proteins in yeast and human

REVIEW

1Department of Microbial Ecology, Lund University, Lund,Sweden. 2Allan Wilson Center for Molecular Ecology andEvolution, Massey University, Palmerston North, NewZealand.

*To whom correspondence should be addressed. E-mail:[email protected]

Fig. 1. The common ancestor of eukaryotes,bacteria, and archaea may have been a communityof organisms containing the following: autotrophsthat produced organic compounds from CO2 eitherphotosynthetically or by inorganic chemical reac-tions; heterotrophs that obtained organics byleakage from other organisms; saprotrophs thatabsorbed nutrients from decaying organisms; andphagotrophs that were sufficiently complex toenvelop and digest prey. !M: endosymbiosis ofmitochondrial ancestor.

www.sciencemag.org SCIENCE VOL 312 19 MAY 2006 1011

on M

arc

h 1

2, 2

007

w

ww

.scie

nce

mag

.org

Dow

nlo

aded

fro

m

indel in the elongation factor EF1! present in eukaryotes and

crenotes.(63)

The phenotypic homogeneity of a group of organisms is

sometimes used as an argument to support their monophyly.

However, this can be misleading. For example, the emer-

gence of birds and mammals led to the appearance of three

phenotypically coherent groups: reptiles, birds, and mam-

mals; but only the latter two are monophyletic, reptiles being

paraphyletic! The phenotypic coherence of the three domains

at the molecular level does not in fact impose a strong

constraint on evolutionary scenarios except to assume two

(Fig. 4a) or three (Fig. 4b)major morphological and physiologi-

cal changes (dramatic evolutionary events), during which

fundamental molecular mechanisms evolved independently

from each others. In a scenario with three dramatic evolution-

ary events, the three groups are related by a trifurcation in an

unrooted tree (Fig. 4b), whilst in the case of two dramatic

evolutionary events, one of the three groups has an ‘‘interme-

diate’’ position between the two others, but this group can be

either mono, poly, or paraphyletic (Fig. 4). The possibility that

archaea have an intermediate position between the two other

domains is suggested by recent data from comparative

genomics. Indeed, one notices a reduction in the number of

proteins involved in transcription, translation, and replication

machineries, with a gradient from Eucarya to Bacteria, and

Archaea in-between. For example, the replication factor that

loads the DNA replicase onto the RNA primer is encoded by

five homologous genes in Eucarya, two in Archaea and one

in Bacteria.(49) The same trends can be recognised in the

number of RNA polymerase subunits and basal transcription

factors, of ribosomal proteins, and of translation initiation

factors. This suggests an evolutionary trend from eukaryotes

to bacteria or from bacteria to eukaryotes, with archaea in

between. A striking feature of this process is the huge gap in

complexity between archaea and eukaryotes, as exemplified

by the evolution of the transcription apparatus (Fig. 5),

Figure 5. Schematic evolution of the transcription appara-

tus. The number of different known subunits is indicated (this

number is an approximation for large eukaryotic complexes).

Double-arrows and question marks indicate that the direction

of evolution is a priori unknown. Homologous protein or

complex have the same colour.

Figure 6. Our proposal for the universal

tree of life. The root is located in the

eukaryotic branch(45) (see text). In all lin-

eages, evolution is supposed to pass

through stages of simplification (blue) and

complexification (red) in terms of gene

contents and variety of molecular mecha-

nisms. Simplification is prevalent in prokary-

otic lineages but with some exceptions.

The reverse is true in eukaryotes.

Problems and paradigms

876 BioEssays 21.10

indel in the elongation factor EF1! present in eukaryotes and

crenotes.(63)

The phenotypic homogeneity of a group of organisms is

sometimes used as an argument to support their monophyly.

However, this can be misleading. For example, the emer-

gence of birds and mammals led to the appearance of three

phenotypically coherent groups: reptiles, birds, and mam-

mals; but only the latter two are monophyletic, reptiles being

paraphyletic! The phenotypic coherence of the three domains

at the molecular level does not in fact impose a strong

constraint on evolutionary scenarios except to assume two

(Fig. 4a) or three (Fig. 4b)major morphological and physiologi-

cal changes (dramatic evolutionary events), during which

fundamental molecular mechanisms evolved independently

from each others. In a scenario with three dramatic evolution-

ary events, the three groups are related by a trifurcation in an

unrooted tree (Fig. 4b), whilst in the case of two dramatic

evolutionary events, one of the three groups has an ‘‘interme-

diate’’ position between the two others, but this group can be

either mono, poly, or paraphyletic (Fig. 4). The possibility that

archaea have an intermediate position between the two other

domains is suggested by recent data from comparative

genomics. Indeed, one notices a reduction in the number of

proteins involved in transcription, translation, and replication

machineries, with a gradient from Eucarya to Bacteria, and

Archaea in-between. For example, the replication factor that

loads the DNA replicase onto the RNA primer is encoded by

five homologous genes in Eucarya, two in Archaea and one

in Bacteria.(49) The same trends can be recognised in the

number of RNA polymerase subunits and basal transcription

factors, of ribosomal proteins, and of translation initiation

factors. This suggests an evolutionary trend from eukaryotes

to bacteria or from bacteria to eukaryotes, with archaea in

between. A striking feature of this process is the huge gap in

complexity between archaea and eukaryotes, as exemplified

by the evolution of the transcription apparatus (Fig. 5),

Figure 5. Schematic evolution of the transcription appara-

tus. The number of different known subunits is indicated (this

number is an approximation for large eukaryotic complexes).

Double-arrows and question marks indicate that the direction

of evolution is a priori unknown. Homologous protein or

complex have the same colour.

Figure 6. Our proposal for the universal

tree of life. The root is located in the

eukaryotic branch(45) (see text). In all lin-

eages, evolution is supposed to pass

through stages of simplification (blue) and

complexification (red) in terms of gene

contents and variety of molecular mecha-

nisms. Simplification is prevalent in prokary-

otic lineages but with some exceptions.

The reverse is true in eukaryotes.

Problems and paradigms

876 BioEssays 21.10

Forterre & Philippe (1999) BioEssays 21:871-879; Kurland et al. (2006) Science 312:1011-1014