Brandi Cantarel, Bernard Henrissat, Pedro M. Coutinho
Architecture et Fonction des Macromolécules Biologiques (UMR 6098)
CNRS / Aix-Marseille Université, France
1st Melampsora Genome Consortium Workshop, Nancy (Aug/08)
Carbohydrate-Active Enzymes in
Melampsora laricis-populina
Outline
• CAZy Database and Website
• Genome Annotation and Comparative Genomics
• Annotation Highlights from Melampsora laricis-populina
• Interpretation and Speculation
Carbohydrate Active enZymes (CAZymes)
1991 Glycoside Hydrolases (112)
• Glycosidases: cleave
• Transglycosidases: form
1997 Glycosyltransferases (91) (NDP-, NMP-, lipid-phosphorylases): form
1998 Polysaccharide Lyases (19): cleave
1999 Carbohydrate Esterases (15): modify
2000 Carbohydrate-Binding Modules (52)
• Adhesion
• Recognition
• Selectivity
CAZy @ AFMB since September 1998 (www.cazy.org)
© Coutinho & Henrissat, 2008
Entry fields: Name of protein; Organism; EC number; GenBank accessions; UniProt accessions; PDB accessions; Subfamily
742 genomes analyzed
(Aug 1st 2008)
662 Bacteria
52 Archaea
28 Eukaryotes
Genome Annotation and Comparative Genomics
CAZy: Carbohydrate-Active EnZymes Database (www.cazy.org)
[Figure: database workflow]
• Inputs: Sequences/Structures (GenBank; UniProt; PDB); Genome Sequence; Biochemical Data (Literature; PubMed; EMP; PMD; Other)
• CAZy Sequences are searched with BLAST and HMMER against a Specialized Library of Modules: Modular Annotation
• Family Annotation: Mechanism; Structure; Function
• Individual CAZyme Annotation
Annotating CAZymes: Function Prediction is a major bottleneck
• Common Genome Annotation Practices
  • Sequence Similarity ~ Specific Functional Prediction (≠)
  • Erroneous annotations are propagated
  • Original error(s) are difficult to track
• Conservative Practices
  • Sequence Similarity = Family inclusion
  • Catalytic machinery checked for borderline cases
  • Functional assignment based on literature
  • Prediction based on subfamily analysis
Annotation and Comparisons
CAZy - Biochemical Bioinformatics
• Correlation of data w/ biochemical databases
• Manual Literature Curation
• Text correlation / mining
CAZy - Phylo-Genetics / -Genomics
• Identify Orthologs and Paralogs
• Identify Analogs (Convergent Evolution)
• Distinguish close / remote relationships
CAZy: On the Genomic Scale
Enzyme discovery in a Single Genome
• Search and list all the CAZymes
• Infer Properties (Mechanism / Fold) from Families
• Infer Function from SubFamilies and Known Biochemically Characterized Cases
Compare CAZyme content of Multiple Genomes
• Correlate CAZyme content with Lifestyle
• Discover singularities in Genomes
• Understand Genome Evolution
© Coutinho, Danchin & Henrissat, 2007
Annotations of CAZymes in Genomes
Modular Annotation
• Identify modules
• Identify gene models with major problems (large truncations, insertions, frameshifts, etc.)
• Identify Signal peptides, Linkers, GPI-anchors, TMs
Functional Annotation
• Sequence similarity to characterized enzymes
• Make use of Subfamilies with characterized enzymes for reliable annotation
• Characterized in the literature
Provide annotations that will “age well”. Several Levels / Categories:
• Known Cases (++): EC activity assignment
• High Similarity (+): “candidate” activity
• Medium Similarity (-): “related to” activity
• Low Similarity (--): “distantly related to” (taxon) activity
Interpretation
• Analogies with better characterized genomes
• Singularities in enzyme distribution
• Interaction with Consortia Biologists
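The four annotation levels above can be read as a small decision rule. The sketch below is only an illustration: the slides give no numeric thresholds, so the function name `annotation_label`, the use of fractional sequence identity, and the `high` and `medium` cutoffs are all assumptions.

```python
def annotation_label(activity, identity, characterized=False, taxon="",
                     high=0.60, medium=0.40):
    """Return a hedged functional annotation string.

    identity: fractional identity (0..1) to the closest biochemically
    characterized enzyme. The `high` and `medium` cutoffs are
    hypothetical values chosen for illustration only.
    """
    if characterized:
        # Known case (++): the activity itself is assigned.
        return activity
    if identity >= high:
        # High similarity (+)
        return f"candidate {activity}"
    if identity >= medium:
        # Medium similarity (-)
        return f"related to {activity}"
    # Low similarity (--): optionally name the taxon of the closest relative.
    if taxon:
        return f"distantly related to {taxon} {activity}"
    return f"distantly related to {activity}"


print(annotation_label("invertase", 0.72))
print(annotation_label("invertase", 0.20, taxon="plant"))
```

Keeping the wording weaker as similarity drops is what lets such annotations "age well": a later biochemical characterization refines the label rather than contradicting it.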
Annotation Highlights from Melampsora laricis-populina
Sequence Similarity based Modular Analysis of CAZymes
• Genome Sequences
• Filter against CAZy Sequences using BLASTP, giving the CAZymes
• Identify Modular Structure using HMMs of Modular Families, giving the Modular Annotation
• CAZyModO: Genomic entry (1. ModO; 2. Function)
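The two-step analysis above (BLASTP filter, then HMM-based module assignment) can be sketched on already-parsed search results. This is not the CAZy implementation: the record types, the e-value cutoff, the "best score wins" overlap rule and the example hits are all hypothetical, and no BLAST or HMMER program is invoked.

```python
from typing import NamedTuple


class BlastHit(NamedTuple):
    protein: str
    evalue: float


class ModuleHit(NamedTuple):
    protein: str
    family: str
    start: int
    end: int
    score: float


def candidate_cazymes(blast_hits, max_evalue=1e-6):
    """Step 1: keep proteins with a BLASTP hit at or below the (assumed) cutoff."""
    return {h.protein for h in blast_hits if h.evalue <= max_evalue}


def modular_structure(module_hits, candidates):
    """Step 2: per candidate, keep non-overlapping module hits (best score
    first) and report the families in N-to-C terminal order."""
    chosen = {}
    for hit in sorted(module_hits, key=lambda h: -h.score):
        if hit.protein not in candidates:
            continue
        kept = chosen.setdefault(hit.protein, [])
        if all(hit.end < k.start or hit.start > k.end for k in kept):
            kept.append(hit)
    return {
        protein: "-".join(h.family for h in sorted(hits, key=lambda h: h.start))
        for protein, hits in chosen.items()
    }


# Made-up example hits, for illustration only.
blast = [BlastHit("prot1", 1e-50), BlastHit("prot2", 0.5)]
modules = [
    ModuleHit("prot1", "GH18", 30, 330, 250.0),
    ModuleHit("prot1", "GH5", 40, 300, 60.0),    # overlaps the stronger GH18 hit
    ModuleHit("prot1", "CBM19", 350, 385, 40.0),
    ModuleHit("prot2", "GH16", 10, 200, 90.0),   # prot2 fails the BLASTP filter
]
print(modular_structure(modules, candidate_cazymes(blast)))
```

Resolving overlaps by score is one simple choice; a curated pipeline would also check module boundaries and catalytic residues by hand for borderline cases.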
Modularity in a Genome: Melampsora laricis-populina
© Coutinho & Henrissat, 2007
SS-based Functional Analysis of CAZymes
Activities in a Genome: Melampsora laricis-populina
Fungal CAZymes: M_lari vs Global Trends

Genome   GH   GT   PL  CBM  LifeStyle
S_cere   45   67    0   12  Saprophyte
A_nige  239  109    8   40  Saprophyte
A_oryz  283  114   21   33  Saprophyte
B_fuck  223   92    9   64  PhytoPath.
T_mela   91   96    3   25  Symbiont
M_gris  231   92    4   63  PhytoPath.
H_jeco  192   93    3   41  Saprophyte
G_zeae  242  102   20   62  PhytoPath.
N_cras  171   74    3   41  Saprophyte
P_anse  229   88    7   75  Saprophyte
S_pomb   46   61    0    8  Saprophyte
C_neof   81   66    3   10  Pathogen
P_chry  179   66    4   47  Saprophyte
L_bico  162   88    7   26  Symbiont
C_cine  210   72   13   90  Saprophyte
M_lari  176   93    6   10  PhytoPath.
P_gram  157   88    4   11  PhytoPath.
U_mayd   97   64    1    9  PhytoPath.

M_lari: Normal GT set; Medium GH set; Low PL / CBM set
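One way to put numbers on remarks such as "medium GH" is to compare M_lari with the median of the other genomes in the table. The sketch below does only that. The choice of the median over all other genomes as the reference is an assumption (the slide's qualitative labels may rest on a different comparison, for example other phytopathogens), and the fused C_cine entry "1390" is read here as PL = 13, CBM = 90.

```python
from statistics import median

CLASSES = ("GH", "GT", "PL", "CBM")

# (GH, GT, PL, CBM) counts transcribed from the slide table.
# Assumption: C_cine "1390" is split as PL=13, CBM=90.
COUNTS = {
    "S_cere": (45, 67, 0, 12),   "A_nige": (239, 109, 8, 40),
    "A_oryz": (283, 114, 21, 33), "B_fuck": (223, 92, 9, 64),
    "T_mela": (91, 96, 3, 25),   "M_gris": (231, 92, 4, 63),
    "H_jeco": (192, 93, 3, 41),  "G_zeae": (242, 102, 20, 62),
    "N_cras": (171, 74, 3, 41),  "P_anse": (229, 88, 7, 75),
    "S_pomb": (46, 61, 0, 8),    "C_neof": (81, 66, 3, 10),
    "P_chry": (179, 66, 4, 47),  "L_bico": (162, 88, 7, 26),
    "C_cine": (210, 72, 13, 90), "M_lari": (176, 93, 6, 10),
    "P_gram": (157, 88, 4, 11),  "U_mayd": (97, 64, 1, 9),
}


def compare_to_others(target, counts=COUNTS):
    """Return {class: (count in target, median count in all other genomes)}."""
    others = [row for name, row in counts.items() if name != target]
    return {
        cls: (counts[target][i], median(row[i] for row in others))
        for i, cls in enumerate(CLASSES)
    }


for cls, (n, med) in compare_to_others("M_lari").items():
    print(f"{cls:>3}: M_lari={n:3d}  median of others={med}")
```

The same function can be pointed at any other genome in the table, for example `compare_to_others("P_gram")`, to see how the second rust compares.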
Fungal Genomes: CAZyme Family & Functional Annotation
Objectives
• Attribution of CAZymes to Families
• Annotation based on Biochemically Characterized cases
• Understand Evolution
[Figure: phylogenetic tree of the fungal genomes analyzed. Species shown: A. fumigatus, A. nidulans, A. niger, A. oryzae, M. grisea, H. jecorina, N. crassa, G. zeae, P. anserina, S. sclerotiorum, C. albicans, C. glabrata, S. cerevisiae, S. pombe, L. bicolor, C. neoformans, P. chrysosporium, U. maydis, R. oryzae. Clade labels: Ascomycota (Eurotiomycetes, Sordariomycetes, Saccharomycotina, Archaeascomycetes), Basidiomycota (Hymenomycetes), Zygomycota.]
© Coutinho, Danchin & Henrissat, 2006
Fungal Genome Crunching
• Kluyveromyces lactis NRRL Y-1140
• Pichia stipitis CBS 6054
• Saccharomyces cerevisiae S288C
• Debaryomyces hansenii CBS767
• Eremothecium gossypii ATCC 10895
• Yarrowia lipolytica CLIB99
• Candida albicans - Private
• Candida glabrata CBS138
• Phaeosphaeria nodorum SN15 - Private
• Aspergillus nidulans FGSC A4 v.2
• Aspergillus nidulans FGSC A4 v.3 - Private
• Aspergillus clavatus NRRL 1 - Private
• Aspergillus flavus NRRL3357 - Private
• Aspergillus niger CBS 513.88 - (2007)
• Aspergillus niger ATCC 1015 - Private
• Aspergillus niger CBS 513.88 - Private
• Aspergillus oryzae RIB 40
• Aspergillus fumigatus Af293 - Private
• Aspergillus terreus NIH2624 - Private
• Coccidioides immitis RS - Private
• Sclerotinia sclerotiorum 1980 - Private
• Botryotinia fuckeliana T4 - Private
• Tuber melanosporum - Private
• Magnaporthe grisea 70-15
• Hypocrea jecorina - Private (2008)
• Gibberella zeae - Private
• Fusarium verticillioides 7600 - Private
• Nectria haematococca mpVI - Private
• Fusarium oxysporum lycopersici - Private
• Cryphonectria parasitica EP155 v1 - Private
• Neurospora crassa OR74A
• Chaetomium globosum CBS 148.51 - Private
• Podospora anserina - Private (2008)
• Schizosaccharomyces pombe 972h-
• Schizosaccharomyces japonicus yFS275 - Private
• Cryptococcus neoformans H99 - Private
• Cryptococcus neoformans var. neoformans JEC21
• Postia placenta Mad-698-R - Private
• Phanerochaete chrysosporium - Private (2004)
• Laccaria bicolor - Private (2008)
• Coprinopsis cinerea - Private
• Melampsora laricis-populina - Private
• Puccinia graminis f. tritici - Private
• Ustilago maydis - Private
• Malassezia globosa CBS 7966 - Private
• Rhizopus oryzae RA 99-880 - Private
• Batrachochytrium dendrobatidis JAM81 - Private
• Encephalitozoon cuniculi GB-M1
>35 Private (Consortia + Extra) and/or 15 Public @ www.cazy.org
Orthologous Distance of Fungal CAZymes (Preliminary Results)
[Figure: orthologous distances over the same genome list as in "Fungal Genome Crunching"; one group is labelled « Rusts ».]
Interpretation and Speculation
Host–Rust Parasite Interaction
• Interaction between rust and host is initiated on the external surface.
• The haustorial mother cell produces a narrow peg that penetrates the host cell wall.
• Pathogen-secreted molecules inside the host cell suppress host defence and enhance susceptibility.
Maheshwari R. The scourge of mankind: From ancient time into the genomic era. Current Science. 2007 (9) 1249-1256.
Infection
• Upon penetration of the plant cell wall by enzymatic dissolution, a haustorium is formed in the periplasmic space of the host cell.
• The interface between the plant and fungal cytoplasm consists of:
  • a gel-like layer consisting of carbohydrates (extrahaustorial matrix)
  • the extrahaustorial membrane, derived from the plant cell wall
• The haustorium is directly connected to the mother cell so that nutrients can be transported from the plant cell to the developing fungal hyphae.
Leonard KL and Szabo LJ. Molecular Plant Pathology (2005). 6 (2), 99-111
M_lari vs Fungal GHs: Highlights

GH        1   2   3   5   7  10  11  12  13  15  16  17  18  20  26  27  28  32  43  47  51  61  78  88 105
S_cere    0   0   0   5   0   0   0   0   8   1   5   4   2   0   0   0   1   1   0   3   0   0   0   0   0
A_nige    3   6  17  10   2   1   4   4  18   2  13   5  14   3   1   4  21   6  10   5   4   7   8   1   2
A_oryz    3   7  23  13   3   4   4   4  17   3  13   5  18   3   1   3  20   4  20   5   3   8   8   3   2
B_fuck    3   2  16  15   2   2   3   4  10   4  21   6  10   1   2   4  18   1   4   8   3   9   8   1   1
T_mela    2   2   6   6   0   1   0   1   8   1   7   4   5   2   0   0   2   1   1   5   0   4   2   0   0
M_gris    2   6  19  13   6   5   5   3  10   2  16   7  14   2   0   4   3   5  19   6   3  17   1   1   3
H_jeco    2   7  13  11   2   1   4   2   5   2  16   4  20   3   0   8   4   0   2   8   0   3   1   0   1
G_zeae    3  10  22  15   2   5   3   4   8   3  21   6  19   2   0   2   6   5  17  10   2  15   7   1   3
P_anse    1   7  11  15   6   8   6   2   9   3  12   4  20   1   1   2   0   0  10   9   1  33   1   0   0
S_pomb    0   0   1   3   0   0   0   0  12   2   3   1   1   0   0   1   0   2   0   2   0   0   0   0   0
C_neof    0   0   7  10   0   0   0   0  10   2  12   1   4   1   0   0   1   1   0   3   1   1   3   2   1
P_chry    2   2  11  20   9   6   1   2   9   2  23   1  11   3   0   3   4   0   4   6   2  15   1   1   0
L_bico    0   2   2  22   0   0   0   3   8   2  31   3  10   2   0   1   6   0   0   9   0   8   0   2   0
C_cine    2   2   7  27   7   5   6   1   9   4  32   3   9   2   0   0   3   0   4   8   1  33   0   1   1
M_lari    0   4   3  30   8   6   0  10   8   4  11   1  15   3   5   7   3   2   8  14   3   2   0   0   1
P_gram    0  10   2  27   8   5   0   3   5   3   9   1  17   2   5  12   1   2   2  14   0   3   0   0   0
U_mayd    0   1   3  12   0   2   1   0   6   1  21   2   3   2   0   1   1   2   4   3   2   0   0   0   1
M_glob    0   0   1   6   0   0   0   0   0   0   7   0   1   0   0   0   0   0   1   2   0   0   0   0   0
Target  PCW PCW PCW  CW PCW PCW PCW PCW Gly Gly FCW FCW FCW FCW   ?   ? PCW Suc PCW FCW PCW  CW PCW PCW PCW

Targets: PCW = Plant Cell-Wall; FCW = Fungal Cell-Wall; CW = Cell-Wall; Gly; Suc; ? = unknown.
Eight families carry an S (saccharification) flag, among them GH1, GH3, GH43 and GH78.
• Low Plant Cell-Wall (PCW) saccharification (S) capacity (GH1, 3, 43, 78…)
• Original combination of high GH7, 10, 12 but absent GH11
• Large number of GH26, 27 but unknown specificity (extrahaustorial matrix?)
• Capacity to saccharify sucrose (GH32) that is absent from PCW-saccharifying fungi
• Normal FCW-aiming enzymes but probably large set in CW-targeting family GH5
• Differences w/ P_gram may reflect host specificity (Dicot/Monocot?)
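The last highlight, that differences with P_gram may reflect host specificity, can be made concrete by listing the GH families whose counts differ most between the two rusts. The sketch below uses the two rows of the GH table; the function name and the `min_gap` threshold of 5 are illustrative assumptions, not part of the original analysis.

```python
GH_FAMILIES = (1, 2, 3, 5, 7, 10, 11, 12, 13, 15, 16, 17, 18,
               20, 26, 27, 28, 32, 43, 47, 51, 61, 78, 88, 105)

# Per-family counts for the two rusts, transcribed from the GH table.
M_LARI = (0, 4, 3, 30, 8, 6, 0, 10, 8, 4, 11, 1, 15,
          3, 5, 7, 3, 2, 8, 14, 3, 2, 0, 0, 1)
P_GRAM = (0, 10, 2, 27, 8, 5, 0, 3, 5, 3, 9, 1, 17,
          2, 5, 12, 1, 2, 2, 14, 0, 3, 0, 0, 0)


def family_differences(a, b, min_gap=5):
    """Return {family: a - b} for families whose counts differ by >= min_gap.

    min_gap is a hypothetical cutoff chosen only to keep the list short.
    """
    return {
        f"GH{fam}": x - y
        for fam, x, y in zip(GH_FAMILIES, a, b)
        if abs(x - y) >= min_gap
    }


# Positive values: more members in M_lari; negative: more in P_gram.
print(family_differences(M_LARI, P_GRAM))
```

With this cutoff the families that stand out are GH2, GH12, GH27 and GH43; whether any of these relate to the dicot/monocot host difference is a question the counts alone cannot answer.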
M_lari vs Fungal CBMs: Highlights
• No CBMs aiming at Plant Cell-Wall (PCW)
• Few CBMs aiming at Fungal Cell-Wall (FCW)

CBM       1  12  13  18  19
S_cere    0   0   0   2   1
A_nige    8   0   1  13   0
A_oryz    3   0   2   5   1
B_fuck   18   0   1  16   0
T_mela    1   0   0  16   1
M_gris   22   0   0  33   0
H_jeco   15   0   3   8   0
G_zeae   12   0   2  34   0
P_anse   30   0   0  30   0
S_pomb    2   0   0   0   0
C_neof    0   0   5   1   0
P_chry   31   0   5   1   0
L_bico    1   1  10   1   1
C_cine   46   1  24   1   2
M_lari    0   1   0   0   5
P_gram    0   1   0   0   3
U_mayd    0   0   0   2   0
M_glob    0   0   0   0   0
Target  PCW FCW PCW FCW FCW
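The two CBM highlights can be checked by summing each genome's CBM counts per target. The sketch below uses the counts and PCW/FCW targets from the CBM table; the dictionary and function names are illustrative.

```python
# Target of each CBM family, in table column order (from the slide).
CBM_TARGET = {1: "PCW", 12: "FCW", 13: "PCW", 18: "FCW", 19: "FCW"}

# Counts for CBM1, CBM12, CBM13, CBM18, CBM19, transcribed from the table.
CBM_COUNTS = {
    "S_cere": (0, 0, 0, 2, 1),   "A_nige": (8, 0, 1, 13, 0),
    "A_oryz": (3, 0, 2, 5, 1),   "B_fuck": (18, 0, 1, 16, 0),
    "T_mela": (1, 0, 0, 16, 1),  "M_gris": (22, 0, 0, 33, 0),
    "H_jeco": (15, 0, 3, 8, 0),  "G_zeae": (12, 0, 2, 34, 0),
    "P_anse": (30, 0, 0, 30, 0), "S_pomb": (2, 0, 0, 0, 0),
    "C_neof": (0, 0, 5, 1, 0),   "P_chry": (31, 0, 5, 1, 0),
    "L_bico": (1, 1, 10, 1, 1),  "C_cine": (46, 1, 24, 1, 2),
    "M_lari": (0, 1, 0, 0, 5),   "P_gram": (0, 1, 0, 0, 3),
    "U_mayd": (0, 0, 0, 2, 0),   "M_glob": (0, 0, 0, 0, 0),
}


def cbm_by_target(row):
    """Sum one genome's CBM counts per target (PCW or FCW)."""
    totals = {"PCW": 0, "FCW": 0}
    for family, count in zip(CBM_TARGET, row):
        totals[CBM_TARGET[family]] += count
    return totals


# Genomes without any PCW-targeting CBM in these five families.
no_pcw = [g for g, row in CBM_COUNTS.items() if cbm_by_target(row)["PCW"] == 0]
print(cbm_by_target(CBM_COUNTS["M_lari"]), no_pcw)
```

Only the five families shown on the slide are counted here, so the totals are lower than the overall CBM numbers in the global-trends table.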
M_lari: Main CAZy Conclusions
• An original distribution of CAZymes, mostly shared with P_gram (where differences may relate to the host)
• Sufficient degrading enzymes (GH + PL, not shown) to perforate the Plant Cell Wall and form the Haustorium, but not for its saccharification
• GH32 invertases present to saccharify Sucrose (like P_gram and U_mayd)
• Open Question: Are some enzymes present to destroy oligosaccharide elicitors (resulting from FCW-degradation by plant enzymes) and diminish the plant response?
CAZy - Team & Funding
Bernard Henrissat (DR1)
Pedro Coutinho (PR2)
Brandi Cantarel (Post-Doc)
Corinne Rancurel (IE - Bioinformatics)
Vincent Lombard (IE - DB Expert)
Thomas Bernard (PhD Student) (2008)
Funding: Centre National de la Recherche Scientifique; Aix-Marseille Universités; ANR-PNRB: E-TriCel
© Coutinho & Henrissat, 2008