
Origin and Molecular Evolution of Receptor Tyrosine Kinases with Immunoglobulin-Like Domains


Julien Grassot,* Manolo Gouy,† Guy Perrière,† and Guy Mouchiroud*

*Centre de Génétique Moléculaire et Cellulaire, UMR Centre National de la Recherche Scientifique 5534, Université Claude Bernard-Lyon 1, Villeurbanne, France; and †Laboratoire de Biométrie et Biologie Évolutive, UMR Centre National de la Recherche Scientifique 5558, Université Claude Bernard-Lyon 1, Villeurbanne, France

Receptor tyrosine kinases (RTKs) are involved in the control of fundamental cellular processes in metazoans. In vertebrates, RTKs can be grouped in distinct classes based on the nature of their cognate ligand and the modular composition of their extracellular domain. RTKs with immunoglobulin-like domains (IG-like RTK) encompass several RTK classes and have been found in early metazoans, including sponges. Evolution of IG-like RTK is characterized by extended molecular and functional diversification, which prompted us to study their evolutionary history. For that purpose, a nonredundant data set including annotated protein sequences of IG-like RTK (n = 85) was built, representing 19 species ranging from sponges to humans. Phylogenetic trees were generated from alignment of conserved regions using a maximum likelihood approach. Molecular phylogeny strongly suggests that IG-like RTK diversification occurred according to a complex scenario. In particular, we propose that specific cis duplications of a common ancestor to both platelet-derived growth factor receptor (class III) and vascular endothelial growth factor receptor (class V) families preceded two trans duplications. In contrast, other IG-like RTK genes, like Musk and PTK7, apparently did not evolve by duplications, whereas fibroblast growth factor receptors (class IV) evolved through two rounds of trans duplications. The proposed model of IG-like RTK evolution is supported by high bootstrap values and by the clustering of genes encoding class III and class V RTKs at specific chromosomal locations in mouse and human genomes.

Introduction

Receptor tyrosine kinases (RTKs) are metazoan-specific plasma membrane receptors that control multiple fundamental cellular processes during development and in adult life, such as cell cycle, migration, metabolism, survival, proliferation, and differentiation. RTKs are transmembrane proteins sharing two major functional domains: the extracellular ligand-binding domain and the intracellular tyrosine kinase domain that distinguishes RTK from all other receptors. In vertebrates, especially in human and mouse, molecular phylogeny analysis of amino acid sequences of the conserved kinase domain was used to define an RTK classification that shows good concordance with the modular structure of extracellular domains (Hubbard and Till 2000; Robinson, Wu, and Lin 2000; Kostich et al. 2002).

Genome sequencing as well as specific investigations point to an early appearance of RTK in metazoans and intense diversification within some RTK subfamilies. The first RTK likely arose from fusion of an epidermal growth factor (EGF)–like domain and a cytoplasmic tyrosine kinase before the appearance of animals (King and Carroll 2001). In the freshwater sponge Ephydatia fluviatilis, nine putative RTK genes were identified following reverse transcription–polymerase chain reaction amplification, of which four are related to the RTK genes found in Drosophila melanogaster and vertebrates: the Musk, ephrin (Eph), Ros, and EGF receptors (Suga, Kato, and Miyata 2001). Besides well-conserved RTK subfamilies, other RTK genes have been found in early metazoans, Caenorhabditis elegans, and D. melanogaster, and several have no orthologs or mammalian paralogs (Plowman et al. 1999; Popovici et al. 1999; Adams et al. 2000; Miller and Steele 2000; Vicogne et al. 2003). Furthermore, sequencing projects are generating large amounts of predicted RTK sequences that may be difficult to annotate based on the current classification, raising the question of whether they represent a novel class of RTK or simply result from a specific evolutionary history. In this respect, understanding RTK evolution would help global RTK classification, which in turn might facilitate annotation of new RTK sequences and the use of new model organisms to study human RTK function.

As mentioned above, the number and diversity of RTK genes sharply increased during metazoan evolution, resulting in complex nomenclature and phylogeny. This is especially true for Eph receptors and RTK with immunoglobulin-like domains (IG-like RTK) that appeared during the early stages of animal evolution and represent the most abundant RTK classes in vertebrates (Muller et al. 1999; Drescher 2002). Due to their widespread distribution within the genome, RTK genes may then provide a useful tool for evolutionary studies, especially on the extent of gene duplication during metazoan evolution and its contribution to genome complexity. Phylogenetic analyses have been carried out for Eph receptors and IG-like RTK, but they used either selected receptors or limited sets of species (Rousset et al. 1995; Parichy et al. 2000; Drescher 2002; Satou et al. 2003). Although these studies suggested a common evolutionary origin within each receptor family, the underlying mechanisms are still elusive. This prompted us to establish a global and comprehensive molecular phylogeny for RTK. For this purpose, we concentrated here on the evolution of IG-like RTK genes. We first established a representative data set of protein sequences from IG-like RTK among which about one-third precede the tetrapods/teleosts divergence. Phylogeny analyses were performed using a maximum likelihood algorithm that allows large phylogenies to be built in a reasonable computing time (Guindon and Gascuel 2003). Our results support a monophyletic origin of IG-like RTK, at least for classes III (platelet-derived growth factor [PDGF] receptors), IV (fibroblast growth factor [FGF] receptors), and V (vascular endothelial growth factor [VEGF] receptors) and point to gene duplication/loss events that resulted in the current repertoire.

Key words: immunoglobulin-like domain, receptor tyrosine kinase, 2R hypothesis.

E-mail: [email protected].

Mol. Biol. Evol. 23(6):1232–1241. 2006
doi:10.1093/molbev/msk007
Advance Access publication March 21, 2006

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Material and Methods

First, a data set of IG-like RTK protein sequences was established from the RTKdb database (Grassot, Mouchiroud, and Perriere 2003), which was developed by collecting RTK sequences from the UniProt collection (Bairoch et al. 2005) and arranging them into classes. Information on the RTKdb database can be accessed at http://pbil.univ-lyon1.fr/RTKdb/. Additional sequences were collected through TBlastN (Altschul et al. 1997) searches in the European Molecular Biology Laboratory (Kanz et al. 2005), Ensembl (Hubbard et al. 2005), and Joint Genome Institute (http://www.jgi.doe.gov/) databases. For that purpose, a first search was performed using vertebrate IG-like RTK sequences as baits, and the resulting hits were recursively used for new runs as long as more members of the RTK family were obtained. The resulting data set contained 111 complete sequences of RTK with IG-like domains from various organisms. Of these, 81 sequences were from species that postdate the tetrapods/teleosts divergence. Sequences from Rattus norvegicus and Mus musculus were removed because they did not add relevant information to that brought by human sequences. This reduced the data set to 85 sequences (table 1).
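The recursive collection step described above can be read as growing a set until a round of searches returns nothing new. The following is a minimal sketch of that loop only. The `search` callable and the `TOY_HITS` table are hypothetical stand-ins for a real similarity search and do not reproduce TBlastN or any database interface.

```python
from typing import Callable, Iterable, Set


def collect_family(seeds: Iterable[str],
                   search: Callable[[str], Set[str]]) -> Set[str]:
    """Grow a sequence set until a round of searches adds no new member."""
    found = set(seeds)
    frontier = set(seeds)
    while frontier:
        new_hits: Set[str] = set()
        for query in frontier:
            # Keep only hits that have not been collected yet.
            new_hits |= search(query) - found
        found |= new_hits
        # Only the newly found sequences are used as baits in the next round.
        frontier = new_hits
    return found


# Hypothetical hit table standing in for a similarity search (assumption).
TOY_HITS = {
    "KIT_HUMAN": {"KIT_CHICK", "FMS_HUMAN"},
    "FMS_HUMAN": {"FMS_BRARE"},
    "KIT_CHICK": {"KIT_HUMAN"},
    "FMS_BRARE": set(),
}


def toy_search(query: str) -> Set[str]:
    return TOY_HITS.get(query, set())


family = collect_family(["KIT_HUMAN"], toy_search)
```

The loop stops as soon as one round yields no unseen sequence, which mirrors the stated stopping rule ("as long as more members of the RTK family were obtained").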

Alignments were performed using MUSCLE (Edgar 2004) with default parameter values, and reliably aligned regions were selected with Gblocks (Castresana 2000). The minimum length for conserved blocks was set to five residues, and we chose to keep positions containing gaps only if less than 50% of the sequences had a gap. Resulting alignments were bootstrapped 1,000 times with the program SEQBOOT from the PHYLIP package (Felsenstein 1989). Phylogenetic trees were computed with the maximum likelihood method implemented in PhyML (Guindon and Gascuel 2003). The Jones-Taylor-Thornton model of amino acid substitution was used (Jones, Taylor, and Thornton 1992). Across-site rate variation was modeled by a gamma distribution with four classes of substitution rates. The alpha parameter of the gamma distribution was estimated by PhyML. The addBootstrap program (distributed upon request by Manolo Gouy) allowed us to merge bootstrap scores and branch lengths in a single tree. Finally, phylogenetic trees were drawn with NJplot (Perriere and Gouy 1996). Computations were performed on the IN2P3 Linux cluster containing more than 1,000 CPUs. All alignments and trees can be downloaded at ftp://pbil.univ-lyon1.fr/pub/datasets/MBE06.
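Two of the steps above are simple enough to illustrate on a toy alignment: the gap criterion (a position is kept only if less than 50% of the sequences have a gap) and bootstrap resampling of alignment columns. The sketch below is an illustration under stated assumptions, not the Gblocks or SEQBOOT implementation. Gblocks also applies conservation and block-length criteria that are not reproduced here, and the function names and the toy alignment are invented for the example.

```python
import random
from typing import List


def drop_gappy_columns(alignment: List[str],
                       max_gap_fraction: float = 0.5) -> List[str]:
    """Keep columns where strictly less than max_gap_fraction of rows are gaps."""
    n_seq = len(alignment)
    keep = [
        col for col in range(len(alignment[0]))
        if sum(seq[col] == "-" for seq in alignment) / n_seq < max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


def bootstrap_replicate(alignment: List[str], rng: random.Random) -> List[str]:
    """Resample columns with replacement, keeping the alignment length."""
    length = len(alignment[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[col] for col in cols) for seq in alignment]


# Toy alignment (invented): column 2 is a gap in 3 of 4 sequences.
toy = ["MK-LV", "MR-LV", "MKALI", "MK-L-"]
filtered = drop_gappy_columns(toy)
replicate = bootstrap_replicate(filtered, random.Random(0))
```

Each replicate has the same number of sequences and columns as the filtered alignment; a tree is then inferred from every replicate, and the fraction of replicate trees that recover a branch gives its bootstrap value.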

Species are denoted as follows, with a UniProt-like nomenclature: ANOGA, Anopheles gambiae; BRARE, Brachydanio rerio; CAEEL, C. elegans; CIOIN, Ciona intestinalis; COTJA, Coturnix coturnix japonica; DROME, D. melanogaster; DUGJA, Dugesia japonica; EPHFL, E. fluviatilis; FUGRU, Fugu rubripes; CHICK, Gallus gallus; GEOCY, Geodia cydonium; HALRO, Halocynthia roretzi; HUMAN, Homo sapiens; HYDAT, Hydra attenuata; NOTVI, Notophthalmus viridescens; PLEWA, Pleurodeles waltlii; STRPU, Strongylocentrotus purpuratus; TORCA, Torpedo californica; and XENLA, Xenopus laevis.
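Because every proposed sequence name ends with one of these species codes (for example KIT_HUMAN), the species can be recovered from a name by splitting on the last underscore. The short sketch below assumes that convention holds for all names; the dictionary simply transcribes the list above, and the function name is chosen for illustration.

```python
# Species codes as listed in the text above.
SPECIES = {
    "ANOGA": "Anopheles gambiae",
    "BRARE": "Brachydanio rerio",
    "CAEEL": "Caenorhabditis elegans",
    "CIOIN": "Ciona intestinalis",
    "COTJA": "Coturnix coturnix japonica",
    "DROME": "Drosophila melanogaster",
    "DUGJA": "Dugesia japonica",
    "EPHFL": "Ephydatia fluviatilis",
    "FUGRU": "Fugu rubripes",
    "CHICK": "Gallus gallus",
    "GEOCY": "Geodia cydonium",
    "HALRO": "Halocynthia roretzi",
    "HUMAN": "Homo sapiens",
    "HYDAT": "Hydra attenuata",
    "NOTVI": "Notophthalmus viridescens",
    "PLEWA": "Pleurodeles waltlii",
    "STRPU": "Strongylocentrotus purpuratus",
    "TORCA": "Torpedo californica",
    "XENLA": "Xenopus laevis",
}


def species_of(sequence_name: str) -> str:
    """Return the species for a name of the form GENE_SPECIESCODE."""
    code = sequence_name.rsplit("_", 1)[1]
    return SPECIES[code]


example = species_of("FGR1_DUGJA")
```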

Results and Discussion

Early Divergence of Musk/PTK7 and Other RTK with IG-Like Domains

In order to establish a relevant molecular phylogeny, we focused our investigations on IG-like RTK classes found in all investigated metazoan species, thereby excluding classes VII (Rikke, Murakami, and Johnson 2000), VIII (Jaaro et al. 2001), and IX (J. Grassot and G. Mouchiroud, unpublished data). The conserved regions of IG-like RTK from 23 species, among which eight had their genome completely sequenced, were used to compute the phylogenetic tree shown in figure 1. This tree was rooted with class II RTK (insulin-related receptors) because these receptors were also present at the very beginning of RTK evolution (Aguinaldo et al. 1997). It shows two major groups corresponding to RTK related to class XVII and class XIX and RTK related to classes III, IV, and V. This dichotomy has a bootstrap support of 86%, which suggests early divergence of both groups of IG-like RTK. Figure 1 also shows a poorly resolved group of RTK sequences of three types: class XIX, class XVII, and a group of receptors mainly found in sponges (GCTK_GEOCY, RTK2_GEOCY, RTK_GEOCY, and EPTK_EPHFL) and ascidians (RTK1_CIOIN and RTK3_CIOIN). Grouping of the H. attenuata sequence with all PTK7 sequences is supported by a high bootstrap value (96%), which indicates that this sequence is homologous to PTK7. Interestingly, PTK7 is a kinase-defective RTK (Mossie et al. 1995), which suggests that strong conservation of this receptor during metazoan evolution was due to constraint on the biological rather than the catalytic function of the molecule. Indeed, it was recently shown that PTK7 is an important regulator of cell polarity in Drosophila and mammals (Lu et al. 2004). Similar to PTK7 sequences, class XIX RTK (Musk) grouped in a single cluster. Other RTK sequences shown in figure 1 segregated in a heterogeneous cluster of sequences, including both porifera (G. cydonium) and deuterostomes (C. intestinalis), yet we could not determine with significant bootstrap support the evolutionary relationships between these RTKs. More extensive taxonomical sampling is required to clarify this point.
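The bootstrap percentages quoted above are the share of bootstrap replicate trees in which a given grouping is recovered. A minimal sketch of that count follows, representing each replicate tree by the set of clades it contains. The replicate data are invented for illustration and are not results from this study, and the function name is an assumption.

```python
from typing import FrozenSet, Iterable, List, Set


def clade_support(clade: Iterable[str],
                  replicate_clades: List[Set[FrozenSet[str]]]) -> float:
    """Percentage of replicate trees that contain the given clade."""
    target = frozenset(clade)
    hits = sum(target in clades for clades in replicate_clades)
    return 100.0 * hits / len(replicate_clades)


ptk7 = frozenset({"PTK7_HUMAN", "PTK7_CHICK", "PTK7_HYDAT"})
other = frozenset({"PTK7_HUMAN", "PTK7_CHICK"})

# Four invented replicate trees, each reduced to its set of clades.
replicates = [
    {ptk7, other},
    {ptk7},
    {other},
    {ptk7, other},
]

support = clade_support(ptk7, replicates)
```

With 1,000 replicates, as used here, a support of 96% would correspond to 960 replicate trees containing the clade.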

A Common Ancestor to Subclasses III, IV, and V

Contrasting with the low diversification among class XVII, class XIX, and closely related RTKs, the phylogeny shown in figure 1 suggests complex mechanisms leading to the generation of class III, class IV, and class V RTKs. By combining molecular phylogeny of human IG-like RTK and analysis of the exon/intron structure of the corresponding genes, Rousset et al. (1995) proposed that class III, class IV, and class V RTK genes evolved from a single ancestor gene by successive duplications. Further phylogenetic analyses supported this model, yet they focused on restricted sets of sequences (Heino et al. 2001; J. Gu and X. Gu 2003; Satou et al. 2003). The present phylogenetic tree, generated with an extended data set, clearly confirms that a duplication event first permitted divergence between class IV and the cluster of class III and class V RTKs. This event may be dated before the protostomes/deuterostomes split because protostome sequences are present both in class IV and in the cluster of classes III and V.

Table 1. List of the IG-Like RTK Sequences Used in This Study

Common Name | Species | Original Accession Number | Proposed Nomenclature
African clawed frog | Xenopus laevis | P26619 | PGFRA_XENLA
African clawed frog | Xenopus laevis | Q91909 | KIT_XENLA
Chicken | Gallus gallus | Q08156 | KIT_CHICK
Chicken | Gallus gallus | Q9PUF6 | PGFRA_CHICK
Human | Homo sapiens | P36888 | FLT3_HUMAN
Human | Homo sapiens | P07333 | FMS_HUMAN
Human | Homo sapiens | P10721 | KIT_HUMAN
Human | Homo sapiens | P09619 | PGFRB_HUMAN
Human | Homo sapiens | P16234 | PGFRA_HUMAN
Japanese puffer fish | Fugu rubripes | P79749 | PGFRB1_FUGRU
Japanese puffer fish | Fugu rubripes | P79750 | FMS1_FUGRU
Japanese puffer fish | Fugu rubripes | Q8AXC6 | KIT_FUGRU
Japanese puffer fish | Fugu rubripes | Q8AXC7 | PGFRA_FUGRU
Japanese puffer fish | Fugu rubripes | Q8UVR8 | FMS2_FUGRU
Japanese puffer fish | Fugu rubripes | Q8UVR9 | PGRFB2_FUGRU
Japanese puffer fish | Fugu rubripes | SINFRUP00000158437 (a) | FLT3_FUGRU
Zebrafish | Brachydanio rerio | Q8JFR5 | KIT_BRARE
Zebrafish | Brachydanio rerio | Q9DE49 | PGFRA_BRARE
Zebrafish | Brachydanio rerio | Q9I8N6 | FMS_BRARE
African clawed frog | Xenopus laevis | P22182 | FGR1_XENLA
African clawed frog | Xenopus laevis | Q03364 | FGR2_XENLA
African clawed frog | Xenopus laevis | O42127 | FGR3_XENLA
African clawed frog | Xenopus laevis | Q91743 | FGR4_XENLA
Ascidian | Ciona intestinalis | Q4H3K6 | FGR_CIOIN
Chicken | Gallus gallus | P18460 | FGR3_CHICK
Chicken | Gallus gallus | P18461 | FGR2_CHICK
Chicken | Gallus gallus | P21804 | FGR1_CHICK
Eastern newt | Notophthalmus viridescens | Q91147 | FGR2_NOTVI
Fruit fly | Drosophila melanogaster | Q07407 | FGR1_DROME
Fruit fly | Drosophila melanogaster | Q09147 | FGR2_DROME
Human | Homo sapiens | P11362 | FGR1_HUMAN
Human | Homo sapiens | P21802 | FGR2_HUMAN
Human | Homo sapiens | P22607 | FGR3_HUMAN
Human | Homo sapiens | P22455 | FGR4_HUMAN
Hydra vulgaris | Hydra attenuata | Q86PM4 | FGR_HYDAT
Iberian ribbed newt | Pleurodeles waltlii | Q91285 | FGR1_PLEWA
Iberian ribbed newt | Pleurodeles waltlii | Q91286 | FGR2_PLEWA
Iberian ribbed newt | Pleurodeles waltlii | Q91287 | FGR3_PLEWA
Iberian ribbed newt | Pleurodeles waltlii | Q91288 | FGR4_PLEWA
Japanese puffer fish | Fugu rubripes | SINFRUP00000128473 (a) | FGR1_FUGRU
Japanese puffer fish | Fugu rubripes | SINFRUP00000143771 (a) | FGR4_FUGRU
Japanese puffer fish | Fugu rubripes | SINFRUP00000147354 (a) | FGR3_FUGRU
Japanese puffer fish | Fugu rubripes | SINFRUP00000160770 (a) | FG1B_FUGRU
Japanese quail | Coturnix coturnix japonica | Q90330 | FGR4_COTJA
Mosquito | Anopheles gambiae | Q7QBL9 | RTK2_ANOGA
Planarian | Dugesia japonica | Q8MY85 | FGR2_DUGJA
Planarian | Dugesia japonica | Q8MY86 | FGR1_DUGJA
Purple sea urchin | Strongylocentrotus purpuratus | Q26614 | FGR_STRPU
Roundworm | Caenorhabditis elegans | Q10656 | EG15_CAEEL
Sea squirt | Halocynthia roretzi | Q95YM9 | FGR_HALRO
Zebrafish | Brachydanio rerio | Q8JG38 | FGR2_BRARE
Zebrafish | Brachydanio rerio | Q90413 | FGR4_BRARE
Zebrafish | Brachydanio rerio | Q90Z00 | FGR1_BRARE
Zebrafish | Brachydanio rerio | Q9I8X3 | FGR3_BRARE
Ascidian | Ciona intestinalis | Q4H2M8 | VGR_CIOIN
Chicken | Gallus gallus | Q8QHL3 | VGR1_CHICK
Human | Homo sapiens | P17948 | VGR1_HUMAN
Human | Homo sapiens | P35968 | VGR2_HUMAN
Human | Homo sapiens | P35916 | VGR3_HUMAN
Japanese puffer fish | Fugu rubripes | SINFRUP00000133464 (a) | VGR2_FUGRU
Japanese puffer fish | Fugu rubripes | SINFRUP00000153089 (a) | VGR3_FUGRU
Japanese quail | Coturnix coturnix japonica | P79701 | VGR3_COTJA
Japanese quail | Coturnix coturnix japonica | P52583 | VGR2_COTJA
Zebrafish | Brachydanio rerio | Q8AXB3 | VGR_BRARE
Zebrafish | Brachydanio rerio | Q5MD89 | VGR3_BRARE
Zebrafish | Brachydanio rerio | Q5GIT4 | VGR2_BRARE
Chicken | Gallus gallus | Q91048 | PTK7_CHICK
Fruit fly | Drosophila melanogaster | Q9V643 | PTK7_DROME
Hydra vulgaris | Hydra attenuata | Q25198 | PTK7_HYDAT
Human | Homo sapiens | Q13308 | PTK7_HUMAN

Evolution of the FGF Receptors (Class IV)

All FGF receptors grouped in a single cluster. However, sequences from species exhibiting four FGF receptor genes grouped together and were clearly distinguishable from other FGF sequences, found in protostomes (nematode: EG15_CAEEL; planarian: FGR1_DUGJA and FGR2_DUGJA; insects: FGR1_DROME, FGR2_DROME, and RTK2_ANOGA), ascidian (C. intestinalis: FGR_CIOIN; sea squirt: FGR_HALRO), echinoderms (sea urchin: FGR_STRPU), and cnidarians (hydra: FGR_HYDAT). Additional phylogenies were performed using restricted combinations of the latter sequences (i.e., by specifically removing some of them from the whole data set). They confirmed grouping of these sequences at the base of the FGF receptor tree (data not shown), indicating that the gross topology of FGF receptors as shown in figure 1 was robust and likely not influenced by the long-branch attraction phenomenon. Although the divergence order could not be precisely determined, some of these sequences suggest key dates for the evolution of this RTK class. Indeed, ascidian sequences of C. intestinalis and H. roretzi are branched at the base of the tree of FGF receptor sequences, suggesting that the duplication events leading to the diversification seen in human occurred after the chordates/urochordates divergence.

As noted above, other class IV sequences, from species that descend from the tetrapods/teleosts split, grouped in four clusters, consistent with human FGF receptor classification (Robinson, Wu, and Lin 2000). For these species, alignments computed on the whole set of sequences led to a phylogeny whose bootstrap values are not significant (fig. 1). In order to obtain a better resolved phylogeny, we needed to increase the number of useful sites in the alignments. For that purpose, we used alignments computed, and then filtered by Gblocks, only on the sequences belonging to class IV. The resulting phylogeny showed a symmetrical topology supported by high bootstrap values (≥80%), which strongly suggests that FGF receptor diversification after the chordates/urochordates split resulted from two successive duplication events: the first duplication led to FGR1/2 and FGR3/4 groups, and the second one separated FGR1 from FGR2 and FGR3 from FGR4 (fig. 2A). A similar topology was observed by Coulier et al. (1997), who suggested that FGF receptors evolved through successive duplications in vertebrates. All duplications likely occurred between the chordates/urochordates and tetrapods/teleosts splits because fish sequences are present in all four groups (FGR1/2/3/4). Interestingly, chordate FGF receptors are found within paralogous chromosome regions that were supposed to evolve from a common ancestral region through several duplications (Pebusque et al. 1998). In this respect, the robustness of the phylogenetic tree shown in figure 2A clearly supports a duplication scheme consistent with the "2R" hypothesis of two rounds of large-scale duplication in the lineage leading to the vertebrates (Ohno 1970; Taylor and Brinkmann 2001; Wolfe 2001).
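The reasoning from a symmetrical topology to two rounds of duplication can be made concrete with a species-overlap rule: an internal node of a gene tree is counted as a duplication when the same species appears on both sides of it, and as a speciation otherwise. The sketch below applies that rule to a deliberately simplified tree containing only human and zebrafish sequences. The species-overlap rule is a generic heuristic named here for illustration and is not the method used in this study, and the nested-tuple tree is a toy reduction of figure 2A.

```python
from typing import Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def species_in(tree: Tree) -> Set[str]:
    """Species codes of all leaves (names look like GENE_SPECIESCODE)."""
    if isinstance(tree, str):
        return {tree.rsplit("_", 1)[1]}
    left, right = tree
    return species_in(left) | species_in(right)


def count_duplications(tree: Tree) -> int:
    """Count internal nodes whose two subtrees share at least one species."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    here = 1 if species_in(left) & species_in(right) else 0
    return here + count_duplications(left) + count_duplications(right)


# Simplified symmetrical topology: ((FGR1, FGR2), (FGR3, FGR4)).
toy_tree: Tree = (
    (("FGR1_HUMAN", "FGR1_BRARE"), ("FGR2_HUMAN", "FGR2_BRARE")),
    (("FGR3_HUMAN", "FGR3_BRARE"), ("FGR4_HUMAN", "FGR4_BRARE")),
)

n_dup = count_duplications(toy_tree)
```

On this toy tree the rule flags three duplication nodes on two levels: one at the root (FGR1/2 versus FGR3/4) and two below it (FGR1 versus FGR2, FGR3 versus FGR4), which is the pattern expected from two successive rounds.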

Evolutionary Relationship Between Class III and Class V

Three sequences from nematode (TKR_CAEEL, otherwise known as T17A3.1; F5V3_CAEEL, or F59F3.1; and F5V4_CAEEL, or F59F3.5) grouped at the root of the cluster comprising classes III and V (fig. 1). Several data sets were tested from which sequences shown in figure 1 were removed according to various combinations. All resulting phylogenies supported branching of the three C. elegans sequences at the base of the classes III/V cluster, which also rules out possible errors due to the long-branch attraction phenomenon (data not shown). These sequences have been previously assigned to the VEGF receptor family, based on conservation of the cysteine residues delimiting IG-like domains (Plowman et al. 1999; Popovici et al. 1999), and the corresponding loci are now referred to as ver-1, ver-3, and ver-4, respectively. As previously noted (Popovici et al. 2002), these sequences did not show strong phylogenetic relationships with human or chordate VEGF receptors, which raises the possibility that they are related to the ancestral sequence of classes III and V. Alternatively, F5V3_CAEEL, F5V4_CAEEL, and TKR_CAEEL could represent "bona fide" VEGF receptors, and the phylogeny shown in figure 1 could result from the higher evolutionary rate of C. elegans genes compared to that of other metazoa (Aguinaldo et al. 1997). Vascular-endothelial-cell/platelet-derived growth factor (VEPG) (or PDGF/VEGF) is the only gene in the D. melanogaster genome that encodes an RTK related to vertebrate PDGF and VEGF receptors (Duchek et al. 2001; Heino et al. 2001). Interestingly, VEPG was also found in the A. gambiae genome (Holt et al. 2002). The presence of seven predicted IG-like domains in the extracellular region as well as its role in blood cell development and migration suggested that VEPG is closer to VEGF receptors than to PDGF receptors (Duchek et al. 2001; Heino et al. 2001; Cho et al. 2002). Our results are in agreement with this suggestion, yet the bootstrap value was not significant for this branch (39%).

Table 1 (continued)

Common Name | Species | Original Accession Number | Proposed Nomenclature
Japanese puffer fish | Fugu rubripes | SINFRUP00000138380 (a) | PTK7_FUGRU
Chicken | Gallus gallus | Q8AXY6 | MUSK_CHICK
Human | Homo sapiens | O15146 | MUSK_HUMAN
Pacific electric ray | Torpedo californica | Q07153 | MUSK_TORCA
Ascidian | Ciona intestinalis | ci0100130025 (b) | RTK1_CIOIN
Ascidian | Ciona intestinalis | ci0100146108 (b) | RTK3_CIOIN
Fruit fly | Drosophila melanogaster | Q95P10 | VEPG_DROME
River sponge | Ephydatia fluviatilis | Q9Y1Y8 | EPTK_EPHFL
Roundworm | Caenorhabditis elegans | O76698 | TKR_CAEEL
Roundworm | Caenorhabditis elegans | Q21038 | F5V3_CAEEL
Roundworm | Caenorhabditis elegans | Q21041 | F5V4_CAEEL
Roundworm | Caenorhabditis elegans | Q9BLY1 | KIN8_CAEEL
Sponge | Geodia cydonium | O18433 | RTK_GEOCY
Sponge | Geodia cydonium | Q27656 | GCTK_GEOCY
Sponge | Geodia cydonium | P42159 | RTK2_GEOCY
Human | Homo sapiens | P08069 | IGF1R_HUMAN
Fruit fly | Drosophila melanogaster | P09208 | INSR_DROME

NOTE: The first and second columns give the common name and the systematic name of the species. The original accession number in the UniProt database is in the third column. A proposed nomenclature following UniProt conventions and deduced from our phylogenetic analyses is given in the last column.
(a) The original accession number in Ensembl databases.
(b) The original accession number in Joint Genome Institute databases.

FIG. 1. Phylogenetic relationships within IG-like RTK. Maximum likelihood tree was generated from conserved domain using the PhyML reconstruction method. The tree was rooted using class II RTK sequences. Bootstrap values ≥75% are shown.

Class III and class V sequences from chordates grouped in two clusters with high bootstrap values (100% and 94%, respectively), suggesting independent evolution from a common ancestor (see above). The divergence occurred by duplication after the protostomes/deuterostomes separation, as indicated by the absence of protostome sequences in classes III and V. Additional duplications occurred before the tetrapods/teleosts divergence, as suggested by the fact that all sampled tetrapods and fishes have the same class III and class V repertoires. Within class V, vertebrate RTK sequences grouped into VGR1, VGR2, and VGR3 subgroups, consistent with previous classification (Shibuya 2002). Interestingly, VGR_CIOIN, from the ascidian C. intestinalis, segregated with class V sequences, which suggests that classes III and V diverged before the appearance of urochordates. In contrast, no C. intestinalis class III sequence could be identified. Collectively, the data suggest that VGR_CIOIN is a class V RTK and that its paralog (class III) was lost in the ascidian lineage after duplication of the class III/class V ancestral gene (Leveugle et al. 2004).

Duplication and Loss of Genes During Evolution of Class III and Class V RTKs

Unlike class IV whose diversification could be ex-plained by two successive rounds of duplications, diversi-fication of classes III and V after the chordates/urochordatessplit follows a more complex scenario. In humans, cluster-ing of PGDS (now referred to as PDGF receptor-alpha),KIT (the cellular homolog of v-kit Hardy-Zuckerman 4feline sarcoma viral oncogene), and VGR (VEGF receptor)-2on chromosome 4q; PGDR (now referred as to PGDFreceptor-beta), CSF-1R (colony-stimulating-factor 1 recep-tor, or Fms), and VGR3 on chromosome 5q; and FLT3(Fms-like tyrosine kinase 3) and VGR1 on chromosome13q suggested evolution of these eight genes from a com-mon ancestor gene through several duplications and spe-cific gene losses (Rousset et al. 1995; Shibuya 2002;J. Gu and X. Gu 2003). Our result clearly supports theexistence of an ancestor cluster of three genes that gaverise to PDGF receptors (PGDS, PGDR), FMS/KIT/FLT3

FIG. 2.—Refined molecular phylogeny of class IV (A) and class V (B) RTK. Maximum likelihood trees were generated from conserved domains of class IV and class V sequences using the PhyML reconstruction method. Trees were rooted with corresponding sequences from Ciona intestinalis and Halocynthia roretzi for class IV or from C. intestinalis for class V.


receptors, and VEGF receptors (VGR1, VGR2, VGR3), respectively. According to this scenario, cis duplication of an ancestral gene (VEPG like) first generated precursors of classes V and III. Then, the class III precursor underwent another cis duplication event, leading to a putative precursor cluster (fig. 3A). As discussed above, the first cis duplication (D1) occurred before the chordate/urochordate split, whereas the second one (D2) occurred after.

The phylogeny of class III RTK supports duplication of a chromosome fragment leading to IG-like RTK clusters on human chromosomes 4q12 (KIT, PGDS, and VGR2) and 5qter (FMS, PGDR, and VGR3). The phylogeny of class V RTK is also in agreement with this possibility (fig. 1). However, the bootstrap values supporting this scenario are low. We attempted to clarify the class V RTK phylogeny by aligning only class V sequences, which resulted in more sites. Higher and significant bootstrap values were then obtained for each tree branch, supporting a common origin of human VGR2 and VGR3 (fig. 2B). Consequently, two models may be proposed for the evolution of the putative ancestor cluster (fig. 3B). Both involve two trans duplications (D3 and D4), generating the IG-like RTK clusters found on chromosomes 4 and 5. This evolution scheme is consistent with the location of both clusters in paralogons between human chromosomes 4 and 5 (Lundin 1993; Perez 2003). Concerning Flt3/VGR1 genes, a parsimonious scenario involves loss of a class III member (fig. 3B, left), whereas a scenario consistent with the 2R hypothesis involves a second round of duplication followed by loss of one cluster and loss of a class III member (missing sixth member) in the remaining cluster (fig. 3B, right). In order to test these hypotheses, we reasoned that some of the genes around class III and class V RTKs in 4q12, 5q33, and 13q12.2/3 might be present elsewhere in the human genome, thereby marking a putative paralogy region with 13q12.2/3. Interestingly, the three clusters include ParaHox genes (GSH/Cdx) in 5′ of class III RTK genes and two groups of related sequences in 3′ of class V RTK genes, confirming the paralogy of these chromosomal regions (Minguillon and Garcia-Fernandes 2003). Similarity search was performed with each sequence in the human genome (http://www.ensembl.org/index.html). Our results first confirmed paralogy of 4q12, 5q33, and 13q12 (supplementary fig. 1, Supplementary Material online). Interestingly, a few genes were found that significantly matched (E < 10^-15) paralogous genes on the three chromosomes, but these genes were dispersed within the genome and did not allow a specific genomic fragment to be defined. Then, we considered genomic regions showing significant homologies with several genes found in the vicinity of class III and class V RTKs in 4q12, 5q33, and 13q12. This investigation pointed to possible homology between these regions and a region

FIG. 3.—Hypothetical scenario for the origin and divergence of class III and class V RTK. Cis duplications (D1 and D2) resulted in the shared ancestor of class III and class V RTK (A). Trans duplications resulted in class III and class V RTK diversification (B). Two models are presented: (1) on the left, the second round of duplication (D4) was limited to a single gene cluster and (2) on the right, the two rounds of duplication (D3 and D4) were followed by gene loss.


found on chromosome 19, in 19q13 (supplementary fig. 1 and supplementary table 1, Supplementary Material online). The HIF3A, GLTSCR1, and MYH14 genes found in 19q13 identify a cluster similar to those found in 3′ of class V RTK genes in the paralogous regions on chromosomes 4, 5, and 13. Interestingly, no RTK sequence was found in 19q13, nor were paralogs of the genomic-screened homeobox (GSH) or Cdx genes. Cdx and collagen type IV genes have been found to define a homology region between chromosome 13q12/q34 and chromosome Xq13/q23 (Minguillon and Garcia-Fernandes 2003). Thus, remains of a paralogon between chromosomes 4, 5, and 13 might be shared by chromosomes 19 and X in humans. In summary, the data support the hypothesis of two trans duplications accompanied by a chromosomal fragment loss (fig. 3B, right) to explain the current localization of class III and V RTK genes in the human genome. A similar conclusion was reached after analyzing the chromosomal location of mouse class III and class V RTK genes (data not shown).
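The bookkeeping behind the two models of figure 3B can be sketched as follows. This is an illustrative toy, not the authors' analysis: the labels in `ANCESTRAL_CLUSTER`, the helper `duplicate`, and the choice of which cluster is copied or lost are assumptions made for the example. Reading the "missing sixth member" as a PDGF receptor-like gene is likewise an inference from the gene lists above (the 13q cluster holds only FLT3 and VGR1).

```python
# Hypothetical labels for the three-gene cluster produced by the cis
# duplications D1 and D2 (names are illustrative, not from the paper).
ANCESTRAL_CLUSTER = ("III_PDGFR_like", "III_KIT_FMS_like", "V_VEGFR_like")

# Observed human clusters as listed in the text.
OBSERVED = {
    "4q": {"PGDS", "KIT", "VGR2"},
    "5q": {"PGDR", "FMS", "VGR3"},
    "13q": {"FLT3", "VGR1"},
}


def duplicate(clusters, which=None):
    """Trans duplication: copy every cluster, or only the one at index `which`."""
    sources = clusters if which is None else [clusters[which]]
    return clusters + [list(c) for c in sources]


def model_left():
    """D3 copies the cluster; D4 is limited to a single cluster; one gene is lost."""
    clusters = [list(ANCESTRAL_CLUSTER)]
    clusters = duplicate(clusters)            # D3: 2 clusters
    clusters = duplicate(clusters, which=0)   # D4: 3 clusters
    clusters[-1].remove("III_PDGFR_like")     # loss of one class III member
    return clusters


def model_right():
    """Two full rounds (D3, D4), then loss of one cluster and of one gene."""
    clusters = [list(ANCESTRAL_CLUSTER)]
    clusters = duplicate(clusters)            # D3: 2 clusters
    clusters = duplicate(clusters)            # D4: 4 clusters
    clusters.pop()                            # loss of one whole cluster
    clusters[-1].remove("III_PDGFR_like")     # loss of one class III member
    return clusters


def cluster_sizes(clusters):
    """Sorted gene counts per cluster, used to compare a model with the data."""
    return sorted(len(c) for c in clusters)


observed_sizes = sorted(len(genes) for genes in OBSERVED.values())
for name, model in (("left", model_left), ("right", model_right)):
    sizes = cluster_sizes(model())
    print(name, sizes, "matches observed:", sizes == observed_sizes)
```

Both models end with eight genes in clusters of three, three, and two, so gene content alone cannot tell them apart. This is why the comparison above turns to the genes flanking the RTK clusters, where traces of a fourth paralogous region (19q13) favor the model with two full rounds of duplication.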

Conclusions

Combined with data from the literature, the present study enabled us to propose a comprehensive scenario of IG-like RTK evolution (fig. 4). The main families of IG-like RTK emerged before the chordate/urochordate split. Further chordate-specific duplication events resulted in diversification of the IG-like RTK family in agreement with the 2R hypothesis. Whereas molecular phylogeny and chromosome synteny provided strong evidence for a shared ancestry of class III and V RTKs, evolutionary relationships between class IV RTK and class III/V RTK need to be clarified. More RTK sequences from invertebrates are needed for this, which should help to refine the evolutionary history of IG-like RTK.

Supplementary Material

Supplementary figure 1 and table 1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We gratefully acknowledge IN2P3 for the computer resources. This work was supported by the Centre National de la Recherche Scientifique, by a “Ligue Nationale Contre le Cancer” grant (program “Equipe Labellisee” for G.M.’s group), and by a fellowship from IFR 41 entitled: “Building specific tools for bioinformatic studies of modular polypeptide sequences.” J.G. was recipient of a fellowship from the

FIG. 4.—Proposed model of IG-like RTK evolution.


Ministere de l’Education Nationale, de la Recherche, et de la Technologie.

Literature Cited

Adams, M. D., S. E. Celniker, R. A. Holt et al. (192 co-authors). 2000. The genome sequence of Drosophila melanogaster. Science 287:2185–2195.

Aguinaldo, A. M., J. M. Turbeville, L. S. Linford, M. C. Rivera, J. R. Garey, R. A. Raff, and J. A. Lake. 1997. Evidence for a clade of nematodes, arthropods and other moulting animals. Nature 387:489–493.

Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.

Bairoch, A., R. Apweiler, C. H. Wu et al. (15 co-authors). 2005. The universal protein resource (UniProt). Nucleic Acids Res. 33:D154–D159.

Castresana, J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17:540–552.

Cho, N. K., L. Keyes, E. Johnson, J. Heller, L. Ryner, F. Karim, and M. A. Krasnow. 2002. Developmental control of blood cell migration by the Drosophila VEGF pathway. Cell 108:865–876.

Coulier, F., P. Pontarotti, R. Roubin, H. Hartung, M. Goldfarb, and D. Birnbaum. 1997. Of worms and men: an evolutionary perspective on the fibroblast growth factor (FGF) and FGF receptor families. J. Mol. Evol. 44:43–56.

Drescher, U. 2002. Eph family functions from an evolutionary perspective. Curr. Opin. Genet. Dev. 4:397–402.

Duchek, P., K. Somogyi, G. Jekely, S. Beccari, and P. Rorth. 2001. Guidance of cell migration by the Drosophila PDGF/VEGF receptor. Cell 107:17–26.

Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Felsenstein, J. 1989. PHYLIP—phylogeny inference package. Version 3.2. Cladistics 5:164–166.

Grassot, J., G. Mouchiroud, and G. Perriere. 2003. RTKdb: database of receptor tyrosine kinase. Nucleic Acids Res. 31:353–358.

Gu, J., and X. Gu. 2003. Natural history and functional divergence of protein tyrosine kinases. Gene 317:49–57.

Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704.

Heino, T. I., T. Karpanen, G. Wahlstrom, M. Pulkkinen, U. Eriksson, K. Alitalo, and C. Roos. 2001. The Drosophila VEGF receptor homolog is expressed in hemocytes. Mech. Dev. 109:69–77.

Holt, R. A., G. M. Subramanian, A. Halpern et al. (120 co-authors). 2002. The genome sequence of the malaria mosquito Anopheles gambiae. Science 298:129–149.

Hubbard, S. R., and J. E. Till. 2000. Protein tyrosine kinase structure and function. Annu. Rev. Biochem. 69:373–398.

Hubbard, T., D. Andrews, M. Caccamo et al. (52 co-authors). 2005. Ensembl 2005. Nucleic Acids Res. 33:D447–D453.

Jaaro, H., G. Beck, and S. G. Conticello. 2001. Evolving better brains: a need for neurotrophins? Trends Neurosci. 24:79–85.

Jones, D. T., W. R. Taylor, and J. M. Thornton. 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8:275–282.

Kanz, C., P. Aldebert, N. Althorpe et al. (32 co-authors). 2005. The EMBL nucleotide sequence database. Nucleic Acids Res. 33:D29–D33.

King, N., and S. B. Caroll. 2001. A receptor tyrosine kinase from choanoflagellates: molecular insights into early animal evolution. Proc. Natl. Acad. Sci. USA 98:15032–15037.

Kostich, M., J. English, V. Madison, F. Gheyas, L. Wang, P. Qiu, J. Greene, and T. M. Laz. 2002. Human members of the eukaryotic protein kinase family. Genome Biol. 3:9.

Leveugle, M., K. Prat, C. Popovici, D. Birnbaum, and F. Coulier. 2004. Phylogenetic analysis of Ciona intestinalis gene superfamilies supports the hypothesis of successive gene expansions. J. Mol. Evol. 58:168–181.

Lu, X., A. G. Borchers, C. Jolicoeur, H. Rayburn, J. C. Baker, and M. Tessier-Lavigne. 2004. PTK7/CCK-4 is a novel regulator of planar cell polarity in vertebrates. Nature 430:93–98.

Lundin, L. G. 1993. Evolution of the vertebrate genome as reflected in paralogous chromosomal regions in man and the house mouse. Genomics 16:1–19.

Miller, M. A., and R. E. Steele. 2000. Lemon encodes an unusual receptor protein-tyrosine kinase expressed during gametogenesis in Hydra. Dev. Biol. 224:286–298.

Minguillon, C., and J. Garcia-Fernandes. 2003. Genesis and evolution of the Evx and Mox genes and the extended Hox and ParaHox gene clusters. Genome Biol. 4:R12.

Mossie, K., B. Jallal, F. Alves, I. Sures, G. D. Plowman, and A. Ullrich. 1995. Colon carcinoma kinase-4 defines a new subclass of the receptor tyrosine kinase family. Oncogene 16:2179–2184.

Muller, W. E., M. Kruse, B. Blumbach, A. Skorokhod, and I. M. Muller. 1999. Gene structure and function of tyrosine kinases in the marine sponge Geodia cydonium: autapomorphic characters in Metazoa. Gene 238:179–193.

Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, New York.

Parichy, D. M., D. G. Ransom, B. Paw, L. I. Zon, and S. L. Johnson. 2000. An orthologue of the kit-related gene fms is required for development of neural crest-derived xanthophores and a subpopulation of adult melanocytes in the zebrafish, Danio rerio. Development 127:3031–3044.

Pebusque, M. J., F. Coulier, D. Birnbaum, and P. Pontarotti. 1998. Ancient large-scale genome duplications: phylogenetic and linkage analyses shed light on chordate genome evolution. Mol. Biol. Evol. 15:1145–1159.

Perez, D. M. 2003. The evolutionarily triumphant G-protein-coupled-receptor. Mol. Pharmacol. 63:1202–1205.

Perriere, G., and M. Gouy. 1996. WWW-query: an on-line retrieval system for biological sequence banks. Biochimie 78:364–369.

Plowman, G. D., S. Sudarsanam, J. Bingham, D. Whyte, and T. Hunter. 1999. The protein kinases of Caenorhabditis elegans: a model for signal transduction in multicellular organisms. Proc. Natl. Acad. Sci. USA 96:13603–13610.

Popovici, C., D. Isnardon, D. Birnbaum, and R. Roubin. 2002. Caenorhabditis elegans receptors related to mammalian vascular endothelial growth factor receptors are expressed in neural cells. Neurosci. Lett. 329:116–120.

Popovici, C., R. Roubin, F. Coulier, P. Pontarotti, and D. Birnbaum. 1999. The family of Caenorhabditis elegans tyrosine kinase receptors: similarities and differences with mammalian receptors. Genome Res. 9:1026–1039.

Rikke, B. A., S. Murakami, and T. E. Johnson. 2000. Paralogy and orthology of tyrosine kinases that can extend the life span of Caenorhabditis elegans. Mol. Biol. Evol. 17:671–683.

Robinson, D. R., Y. M. Wu, and S. F. Lin. 2000. The protein tyrosine kinase family of the human genome. Oncogene 19:5548–5557.

Rousset, D., F. Agnes, P. Lachaume, C. Andre, and F. Galibert. 1995. Molecular evolution of the genes encoding receptor tyrosine kinase with immunoglobulinlike domains. J. Mol. Evol. 41:421–429.

Satou, Y., Y. Sasakura, L. Yamada, K. S. Imai, N. Satoh, and B. Degnan. 2003. A genomewide survey of developmentally relevant genes in Ciona intestinalis. V. Genes for receptor tyrosine kinase pathway and notch signaling pathway. Dev. Genes Evol. 213:254–263.

Shibuya, M. 2002. Vascular endothelial growth factor receptor family genes: when did the three genes phylogenetically segregate? Biol. Chem. 383:1573–1579.

Suga, H., K. Katoh, and T. Miyata. 2001. Sponge homologs of vertebrate protein tyrosine kinases and frequent domain shufflings in the early evolution of animals before the parazoan-eumetazoan split. Gene 280:195–201.

Taylor, J. S., and H. Brinkmann. 2001. 2R or not 2R? Trends Genet. 17:488–489.

Vicogne, J., J. P. Pin, V. Lardans, M. Capron, C. Noel, and C. Dissous. 2003. An unusual receptor tyrosine kinase of Schistosoma mansoni contains a Venus flytrap module. Mol. Biochem. Parasitol. 126:51–62.

Wolfe, K. H. 2001. Yesterday’s polyploids and the mystery of diploidization. Nat. Rev. Genet. 2:333–341.

William Martin, Associate Editor

Accepted March 16, 2006
