Laboratoire d’immunogénétique moléculaire, U.A. 3034, IFR30

INTRODUCTION

The genes responsible for blood groups are, by definition, polymorphic.

In many cases it has recently been shown that this polymorphism was not maintained by chance but is the result of selection.

The selective agents most often invoked are those of malaria (ABO, GPA, GPB, GPC, FY).

Other, very diverse infectious agents may be involved: viruses (Calicivirus) and bacteria (cholera and Helicobacter pylori).

In other cases, the causes of selection are not known (Kell).

The sequencing of whole genomes (human, chimpanzee, gorilla, orangutan, macaque and many other mammals), together with large-scale studies of human genome polymorphism, has renewed the tools available for detecting genes under selection.

[Figure: signatures of selection, after Sabeti et al., Science 2006; 312: 1614-1620. Labels recovered from the slide: KA/KS (GP); excess of high-frequency alleles; extreme differences between populations (the particular case of the FY*0 allele); selective sweep and hitchhiking. Genes placed on the figure: ABO, Kell, Fy.]

ABO gene polymorphism is maintained by selection

1) Scatter plot of neutrality test statistics in European- and African-Americans (Akey et al. 2004)

2) Extended haplotype homozygosity plots around functional SNPs in the ABO locus (Fry et al. 2008)

3) A region of low FST around the ABO gene (Fry et al. 2008)

4) Linkage disequilibrium (LD) around the ABO glycosyltransferase gene (Fry et al. 2008)

5) Recombination in the ABO gene obscures the reconstruction of ABO dynamics (Calafell et al. 2008)

Scatter Plot of Neutrality Test Statistics in European- and African-Americans (Akey et al. 2004)

Figure 1. Scatter plot of neutrality test statistics in European- and African-Americans. Genes that are nominally significant (p < 0.05) in European-Americans (EA), African-Americans (AA), or both populations are denoted by red, blue, and green circles, respectively. Genes that are not significant are shown as black dots. Two-sided tests were used for Tajima’s D, Fu and Li’s D*, and Fu and Li’s F*, and a one-sided test was used for Fay and Wu’s H.

[Figure panels: Tajima’s D; Fu and Li’s D*; Fu and Li’s F*; Fay and Wu’s H (in p-value). Axis labels: African-Americans, European-Americans. ABO and Kell are highlighted in the panels.]
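As an illustration of one of the neutrality statistics named in the caption, here is a minimal sketch of Tajima's D computed from a small alignment. It assumes equal-length sequences with no gaps or missing data; the function name `tajimas_d` and the toy inputs are hypothetical and are not taken from Akey et al. (2004).

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (no gaps, no missing data)."""
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term is zero, so D is undefined.
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) columns in the alignment.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    # Mean number of pairwise differences between sequences.
    pairs = list(combinations(seqs, 2))
    mean_pairwise = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in pairs
    ) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise - seg_sites / a1) / sqrt(variance)


# Toy data: only singletons (excess of rare variants) versus
# two balanced haplotype classes (excess of intermediate-frequency variants).
singletons = ["AAAA", "AAAT", "AATA", "ATAA"]
balanced = ["AA", "AA", "TT", "TT"]
```

With these toy inputs, the singleton-only sample gives a negative D and the balanced sample a positive D, which is the direction expected under balancing selection.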

Extended haplotype homozygosity plots around functional SNPs in the ABO locus.

Figure 3. Extended haplotype homozygosity plots around functional SNPs in the ABO locus. Alleles that have risen rapidly in frequency because of recent positive or balancing selection (e.g. a partial selective sweep) can be surrounded by a region of similar haplotypes that can extend for hundreds of kilobases (45,46). This occurs because recombination has had insufficient time to swap variation between the selected haplotype and others in the population. The plots show the decay of homozygosity (EHH) on phased haplotypes partitioned by the alleles of the functional ABO SNPs. (A) rs8176719, the frameshift mutation in exon 6 of the ABO gene, marking O haplotypes, which is associated with protection from severe malaria. (B) rs8176746, a nonsynonymous coding SNP in the N-terminal catalytic domain of ABO, one of the functional variants determining A/B glycosyltransferase activity. The lack of a pronounced EHH signal suggests that the balanced selection affecting variation at the ABO gene is longstanding.

Fry, A. E. et al. Hum. Mol. Genet. 2008 17:567-576; doi:10.1093/hmg/ddm331
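The EHH statistic described in this caption can be sketched as follows. This is a minimal illustration, assuming phased haplotypes encoded as equal-length strings; the function name `ehh` and the toy haplotypes are hypothetical and do not reproduce the pipeline of Fry et al.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, upto_index):
    """Extended haplotype homozygosity for carriers of `core_allele`.

    EHH is the probability that two randomly chosen carrier chromosomes are
    identical at every marker between the core SNP and `upto_index`.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two carriers of the core allele")
    lo, hi = sorted((core_index, upto_index))
    # Group carriers by their extended haplotype over the interval.
    classes = Counter(h[lo:hi + 1] for h in carriers)
    return sum(comb(count, 2) for count in classes.values()) / comb(n, 2)


# Toy phased haplotypes: marker 0 is the core SNP.
haps = ["0101", "0100", "0111", "1100"]
decay = [ehh(haps, 0, "0", x) for x in range(1, 4)]
```

A slow decay of EHH with distance from the core SNP points to a recent rise in frequency; a rapid decay, as reported for ABO, is what an old polymorphism would show.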

A region of low FST around the ABO gene.

Figure 4. A region of low FST around the ABO gene. (A) 400 single-SNP FST values for three HapMap populations (CEU, YRI and combined Asian) surrounding the ABO gene. FST drops to a level of ~0.001 in an 85 SNP window across the gene and the noncoding sequence ~20–30 kb upstream. (B) Histogram representing an empirical distribution of FST determined by screening similarly sized windows across chromosome 9. Three different window sizes were used based on either marker numbers (85 SNPs), genetic distance (0.054 cM) or physical distance (50 kb); only windows containing more than two markers were included. By all three distributions, the region around the ABO locus is a relative outlier (~99.5–99.9th centile for low FST).

Fry, A. E. et al. Hum. Mol. Genet. 2008 17:567-576; doi:10.1093/hmg/ddm331
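The windowed FST scan described above can be sketched as follows. The estimator used by Fry et al. is not given on the slide, so this sketch assumes the simple definition FST = (HT - HS) / HT with equal population weights; the function names and the input layout (one tuple of allele frequencies per SNP) are hypothetical.

```python
def fst_single_snp(freqs):
    """FST for one biallelic SNP from per-population allele frequencies.

    Assumption: equal weighting of populations, FST = (HT - HS) / HT.
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # monomorphic in the pooled sample
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def windowed_fst(snp_freqs, window):
    """Mean single-SNP FST in sliding windows of `window` consecutive SNPs."""
    values = [fst_single_snp(f) for f in snp_freqs]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

A window whose mean FST falls in the extreme low tail of the empirical distribution, as reported for ABO, indicates allele frequencies that are unusually similar between populations.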

Linkage disequilibrium (LD) around the ABO glycosyltransferase gene

Fry, A. E. et al. Hum. Mol. Genet. 2008 17:567-576; doi:10.1093/hmg/ddm331

Figure 1. Linkage disequilibrium (LD) around the ABO glycosyltransferase gene. Strong LD exists between the four key functional SNPs. (A) Yoruba HapMap parent–offspring trios from Ibadan in Nigeria. (B) 1320 Gambian parent–offspring trios. Both of these West-African populations have near perfect LD between the three nonsynonymous SNPs that differentiate A and B ABO alleles, and moderate but lower LD with the common deletion that generates the O allele. (C) r2 values for the Yoruba population across the ABO gene. Phased genotypes (HapMap, July 2006) with additional genotyping for four functional polymorphisms in ABO. The ABO gene (total region shown chromosome 9, ~24 kb, 133156822–133180999, NCBI Build 35) is illustrated 3′ to 5′ with exons 6 and 7 on the left. The SNPs differentiating A and B blood groups are in a high LD block, and indicated by three arrows under exon 7. The frameshift mutation responsible for the O allele (arrow under exon 6) is on the 3′ edge of another high LD block.
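The r2 measure of LD used in panel (C) can be computed from phased haplotypes as in the following minimal sketch. It assumes biallelic SNPs coded "0"/"1" in equal-length haplotype strings; the function name `r_squared` and the encoding are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """r^2 between biallelic SNPs at positions i and j of phased haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n
    p_b = sum(h[j] == "1" for h in haplotypes) / n
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denominator
```

Two SNPs whose alleles always travel together give r2 = 1 (the "near perfect LD" of the caption); independent SNPs give r2 = 0.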

[Figure: biosynthesis of the A and B antigens from substance H (Fuc 1->2 Gal-R). The type A transferase (UDP-GalNAc donor) adds GalNAc 1->3 to the galactose; the type B transferase (UDP-Gal donor) adds Gal 1->3.]

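[Figure: structure of the A, B and O transferases, from NH2 to COOH. Labels recovered from the slide: O01, O02, O03; 261 del -> frameshift; COOH at positions 117, 354 and 375; A versus B: Leu/Met at 266 and Gly/Ala at 268.]

Amino acid differences between the A1, B and A2 transferases:

Position   A1    B     A2
176        Arg   Gly   Arg
235        Gly   Ser   Gly
266        Leu   Met   Leu
268        Gly   Ala   Gly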

Worldwide distribution of O alleles

[Map figure: pie charts of O allele frequencies by population. Legend: O01, O02, O03, O542, Ov2, Ov7, others.]

(1) Phylogenetic network of human ABO alleles (those found from more than one individual)11

[Figure: network of 18 allele sequences; branches carry the positions of the nucleotide differences (including 295del and an 8-bp insertion). Labeled groups: B group, A1/O1 group, O2 group, Root. One branch is labeled "61 nuc. dif.".]

Sequence 1 = #A101
Sequence 2 = #B101
Sequence 3 = #O01
Sequence 4 = #O02
Sequence 5 = #O1v_B_13
Sequence 6 = #O05
Sequence 7 = #Ov2
Sequence 8 = #Ov7
Sequence 9 = #O1v_tlse07
Sequence 10 = #O01_C467T
Sequence 11 = #O03
Sequence 12 = #O1v_G542A
Sequence 13 = #Ov6
Sequence 14 = #Ov_tlseO1
Sequence 15 = #Ov_tlse02
Sequence 16 = #Ov_tlse08
Sequence 17 = #Ov7 bis
Sequence 18 = #Ov_tlse20

[Figure: recombinant ABO alleles mapped across exon 6 (E6), intron 6 and exon 7. Legend: A, B, O1, O1v. Alleles shown: Tlse 13, Tlse 14, Tlse 51, Tlse 52, Ov7, O03, O05, O06 (or o1-o1v), O19, o1v-a1, b-o1, R101 (or A204, or b-(b/o1v)), R102, R103-2, A112, B108, Ax02 (Ax3), Ax03 (Ax2), Ax4, Ax5. Phenotype labels: O or A very weak, Ax, A1, B. Scale: 0% to 100%.]

[Figure: ABO lineages (A101, A201, B101, O01, O02, O09, O47) and chimpanzee, drawn separately for three segments of the gene: 1 kb-6 kb (2.0 My), 6 kb-20 kb (2.8 My) and 20 kb-23.7 kb (4.8 My).]

Fig. 1 Nucleotide diversity along the ABO gene. Diversity is plotted for windows of 1,000 bp moved 25 bp. On the X-axis, from top to bottom: i) a cartoon of the ABO gene, with exons numbered and regions not sequenced by Seattle SNPs as striped boxes; ii) a physical scale; iii) position of the 214 segregating sites (multiples of 10 only)

Fig. 6 Tajima’s D along the ABO sequence. Window length and X-axis as in Fig. 1. Statistically significant peaks are marked with black bars above them
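The sliding-window scheme named in these captions (windows of 1,000 bp moved 25 bp) can be sketched for nucleotide diversity as follows. This is an illustrative sketch that assumes aligned, equal-length sequences without gaps; the function names are hypothetical and the code is not the one used to draw the figures.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for aligned sequences."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    differences = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in pairs
    )
    return differences / (len(pairs) * length)


def sliding_pi(seqs, window=1000, step=25):
    """Return (window start, pi) for each window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]
```

The same loop can carry any per-window statistic, for instance Tajima's D, by swapping the function applied to each window.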

[Figure: genealogy of ABO lineages in Homo sapiens and other hominoids. Human lineages: A101, A201, B101, O01, O02, O09, O47, with Δ261 marked on the tree. Other species labels: Pan troglodytes, Pan paniscus, Gorilla, with alleles A, A1, A2, B, Ox and Odel. Time labels: 0.26, 0.3, 1.15, 2.5 and 3.5 My.]

[Figure: phylogenetic tree of primate ABO alleles, with bootstrap percentages and branch lengths at the nodes. Labeled groups: human alleles (e.g. A101 to A104, A2, Ax1, A3-1, B101 to B103, B3-1, B(A), cis AB, O101, O1var, O201 to O203, O3); chimpanzee alleles (A1, A2, Odel, Ox, Patr. 1 to 3); gorilla (Gogo-1 to Gogo-5); orangutan (Orang-1 to Orang-3, Popy 1); macaque alleles B & O (Rh*B101 to Rh*B103, Cy*B101, Cy*B102, Rh. O); macaque alleles A & O (Rh*A101, Cy*A102, Cy*A103, Cy*O101); baboon A, B and O.]

[Figure: tree of primate ABO sequences (human, chimpanzee, orangutan, gorilla, macaques, baboons, gelada, Cebus, marmoset, Saimiri, lemur and galago), with a distance scale from 0.00 to 0.07. Labeled groups: A allele of anthropoids; A allele of New World monkeys (NWM); A allele of Old World monkeys (OWM); B alleles of all monkeys; B alleles of prosimians.]

Inter-allelic exchanges: crossing-over and gene conversion

[Figure: aligned sequences of alleles R and V at two loci (labels: allele R BF, allele V BF, allele R BC, allele V BC) and the recombinant alleles derived from them (allele rec1 V, allele rec2 R). Panel labels: crossing-over; gene conversion; non-conservative exchange; conservative exchange.]

Inter-allelic exchanges generate families of recombinant alleles.

In the long run, among all the recombinant alleles, only one will persist (or at least dominate all the others in frequency).

Without a selective advantage, a bi-allelic system is unstable: one of the two alleles will disappear.

Balancing selection is required to maintain polymorphism over very long periods.

The A or B polymorphism is favorable, but the AB phenotype is unfavorable.

The appearance of a silent O allele makes it possible to maintain the A/B polymorphism by lowering the frequency of AB individuals, who produce neither anti-A nor anti-B. It also gives rise to a new phenotype, O, which produces both anti-A and anti-B.

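Two alleles (A and B) versus three alleles (A, B and O)

Allele frequencies
  Two alleles:    A = 0.5      B = 0.5
  Three alleles:  A = 0.1666   B = 0.1666   O = 0.6666

Genotype and phenotype frequencies
  Two alleles:    A/A = 0.25 (phenotype A = 0.25); B/B = 0.25 (phenotype B = 0.25); AB = 0.50
  Three alleles:  A/A = 0.027 and A/O = 0.22 (phenotype A = 0.25); B/B = 0.027 and B/O = 0.22 (phenotype B = 0.25); AB = 0.05; OO = 0.44 (phenotype O = 0.44)

[Bar chart: y-axis from 0 to 0.6; x-axis labels: A = B, AB, O, none; legend: Anti-A, Anti-B.]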
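The frequencies on this slide can be checked with a short sketch. It assumes random mating (Hardy-Weinberg proportions), which the slide does not state explicitly; the function name `abo_phenotypes` is illustrative.

```python
def abo_phenotypes(p_a, p_b, p_o=0.0):
    """Expected ABO phenotype frequencies under Hardy-Weinberg proportions.

    Assumption: random mating; A and B are codominant, O is silent (recessive).
    """
    if abs(p_a + p_b + p_o - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return {
        "A": p_a**2 + 2 * p_a * p_o,   # A/A and A/O genotypes
        "B": p_b**2 + 2 * p_b * p_o,   # B/B and B/O genotypes
        "AB": 2 * p_a * p_b,           # A/B genotype
        "O": p_o**2,                   # O/O genotype
    }


two_alleles = abo_phenotypes(0.5, 0.5)
three_alleles = abo_phenotypes(1 / 6, 1 / 6, 2 / 3)
```

With the slide's allele frequencies, introducing O leaves phenotypes A and B at one quarter each while the AB class shrinks from one half to about one in eighteen.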

The ABO gene is not the only glycosyltransferase gene to have come under selective pressure from infectious agents.

Two other genes, now well known, have been inactivated in some primates, including humans:

The hydroxylase that converts acetylneuraminic acid into glycolylneuraminic acid (CMP-NeuAc hydroxylase).

The alpha 1-3 galactosyltransferase.

Other genes present in humans are absent in other primates:

The Lewis gene is absent in New World monkeys.

New World monkeys do not express substance H on endothelial cells.

[Diagram: N-acetylneuraminic acid (Neu5Ac) -> CMP-NeuAc hydroxylase -> N-glycolylneuraminic acid (Neu5Gc)]

Takahata and colleagues described how the deletion occurred. They proposed a molecular mechanism of Alu-mediated replacement based on the finding that a region containing a 92-bp exon and an AluSq element in the hydroxylase gene is intact in all nonhuman primates examined, and the same region in the human genome is replaced by an AluY element that was disseminated at least one million years ago.

N-acetylneuraminic acid (Neu5Ac)

Neu5Gc expression on human GYPA markedly reduces binding of PfEBA-175.

It suggests that P. falciparum emerged through selective evolution of its EBA-175, toward preferentially recognizing the Neu5Ac-rich erythrocytes of humans (i.e., P. falciparum evolved from a strain of ancestral P. reichenowi that adapted to this radical change in the human ‘‘sialome’’).

The 3D structure of PfEBA-175 in complex with 3-sialyllactose was published (62). Notably, amino acids that contact glycans in two of the binding sites are divergent in PrEBA-175. (N.H. Tolia, E.J. Enemark, B.K.L. Sim and L. Joshua-Tor this issue, Cell 122 (2005), pp. 183–193)

Cserti, C. M. et al. Blood 2007;110:2250-2258

[Figure labels: Galili; Varki; Hosa; Aotus; Lewis A; endothelium H type 2. Human and Aotus: P. falciparum, PfEBA175. Chimpanzee: P. reichenowi, PrEBA175. Glycan binding.]

The fact that P. falciparum malaria can cause severe disease in some species of the distantly related New World Aotus monkey can also be explained by the expression of Neu5Ac on Aotus monkey erythrocytes.

Differences between PfEBA175 and PrEBA175 at glycan-binding sites 3–6 are of particular interest because it has recently been suggested that sialic acid recognition has a role in determining the host specificity of P. falciparum and P. reichenowi, neither of which, despite their genetic similarity, can infect the hosts of the other efficiently [13]. The observed differences in the glycan-binding pockets of PfEBA175 and PrEBA175 are consistent with this theory.

Thus, even if the GYPA sequence differs from that of humans, the overall aspect of the O-glycans and their terminal Neu5Ac Sias decorating the protein must be similar enough in both species to allow the infection.

A study comparing 280 genes among Old World primates showed that the GYPA gene exhibited the strongest evidence for rapid evolution (30), likely reflecting strong malarial parasite-mediated selection pressure. PfEBA-175-RII also seems to be evolving rapidly (30). It is reasonable to link both phenomena, because the interaction between these two proteins seems to be a key step affecting the efficiency of malaria propagation. A scenario wherein PfEBA-175 is rapidly evolving to track changes in GYPA, which, in turn, is rapidly evolving to escape from the former, represents a classical example for the ‘‘red-queen effect’’ operating on protein–glycan recognition (52).

Rapidly Evolving Genes in Human: The Glycophorins and Their Possible Role in Evading Malaria Parasites. Hurng-Yi Wang, Hua Tang, C.-K. James Shen, and Chung-I Wu. Mol. Biol. Evol. 20(11):1795–1804. 2003

[Figure: KA, KS and KA/KS for GPA.]

CONCLUSION

Molecules expressed at the surface of red blood cells are used as targets by infectious agents to enter cells.

Many of the genes responsible for the expression of these molecules at the red cell surface have been forced to evolve in order to reduce the host's susceptibility to pathogens.

The infectious agents have themselves had to evolve to counter the host's new defenses (Red Queen theory).

This permanent evolutionary race between host and parasite has resulted in the maintenance of polymorphism in primates.

Recombination between alleles makes a precise reconstruction of evolutionary scenarios illusory.

Fig. 3 Divergence between O01 and O02 alleles, expressed as Dxy (×10⁻³). X-axis as in Fig. 1. The arrow indicates the position of Δ261

Correspondence between the ABO O allele nomenclature as in the “Blood group antigen gene mutation database” (BGMUT) and as in Roubinet et al. [10]

BGMUT   Roubinet et al.
O09     Ov2
O11     O1vG542A
O12     Ov6
O21     O01C467T
O26     Ovartlse02
O27     Ovartlse04
O28     Ovartlse07
O29     Ovartlse08
O31     Ovartlse03
O33     Ovartlse11
O34     Ovartlse01
O44     Ov7.2
O46     Ovartlse52
O47     Ovartlse20

Table 1. Summary statistics of Seattle SNPs ABO sequences. N, number of chromosomes; S, number of segregating sites; k, number of different haplotypes; H, haplotype diversity; π, nucleotide diversity (×10⁻⁴).

Population           N    S     k    H        π
African Americans    48   207   39   0.9867   29
European Americans   46   161   28   0.9623   26
Total                94   214   61   0.9838   28

Table 2. Frequencies for the different ABO lineages in the Seattle SNP dataset. Each haplotype was allocated to the lineage it bore in the 5’ end of the sequence. See also Figure 2.

Lineage   Afr. Am.     Eur. Am.
O47       4 (0.083)    ---
O02       8 (0.167)    11 (0.239)
O01       8 (0.167)    21 (0.457)
O09       11 (0.229)   ---
A101      4 (0.083)    7 (0.152)
A201      2 (0.042)    7 (0.152)
B101      11 (0.229)   ---

Table 3. Sequence variability parameters in O alleles from Roubinet et al. (2004). N, sample size; k, number of different haplotypes; S, number of polymorphic sites; H, haplotype diversity; π, nucleotide diversity; D, Tajima’s statistic; Fs, Fu and Li's F statistic.

Population   N     k    S    H               π
Akans        136   15   42   0.868 ± 0.011   62
Berbers      78    9    33   0.656 ± 0.042   52
Basques      220   11   40   0.583 ± 0.026   51
Putien       94    2    19   0.491 ± 0.019   50
Fujiou       86    2    19   0.506 ± 0.009   52
Cayapas      74    5    21   0.580 ± 0.034   52
Aymaras      126   5    22   0.563 ± 0.035   42
Total        814   23   51   0.666 ± 0.011   55
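The haplotype diversity H reported in these tables is usually computed from haplotype counts with Nei's unbiased estimator; the slides do not say which estimator was used, so the following is a sketch under that assumption. The function name `haplotype_diversity` is hypothetical. The example applies it to the African-American lineage counts of Table 2, which gives a lineage-level diversity and is therefore not expected to match the haplotype-level H of Table 1.

```python
def haplotype_diversity(counts):
    """Nei's unbiased diversity: H = n / (n - 1) * (1 - sum of p_i squared)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq)


# Lineage counts for African Americans from Table 2
# (O47, O02, O01, O09, A101, A201, B101); illustration only.
afr_am_lineages = [4, 8, 8, 11, 4, 2, 11]
lineage_diversity = haplotype_diversity(afr_am_lineages)
```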