Measuring proximal protein associations in vivo using protein-fragment complementation
Master's thesis
Andrée-Ève Chrétien
Master's in Biology
Master of Science (M.Sc.)
Québec, Canada
© Andrée-Ève Chrétien, 2017
Supervised by:
Christian Landry, research supervisor
Summary
Protein-protein interactions (PPIs) underlie cellular function in all organisms. Methods for studying PPIs fall into two categories: they either identify the proteins that make up a complex or determine the relationships between proteins. Few hybrid methods provide both kinds of information, and those that exist have several limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA relies on the association of two complementary reporter fragments in the presence of a protein-protein interaction. The reporter fragments are fused to the proteins through a peptide linker. The length of the linker limits the maximum distance at which an interaction between two proteins can be detected. Our hypothesis was that increasing linker length would allow us to detect more distant interactions. We first verified that increasing linker length changed our ability to detect interactions without loss of specificity. New interactions were detected within a single protein complex and between two complexes. We then validated our ability to better dissect the architecture of protein complexes by examining five protein complexes in depth using several combinations of linker lengths. Finally, we confirmed that the method does detect interactions between more distant proteins by comparing our results with distances calculated from available proteasome structures. This variation on DHFR PCA makes it possible to tune the resolution at which PPIs are studied and thus to better define the architecture of protein complexes.
Abstract
Protein-protein interactions (PPIs) are central to all cellular processes in all organisms. Methods for studying PPIs fall into two categories: they either identify the proteins that compose protein complexes or determine the relationships between proteins. Only a few hybrid methods provide both kinds of information, and these methods have many limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on the association of two complementary reporter fragments in the presence of an interaction. Both fragments are fused to proteins through a peptide linker. Linker length limits the maximal distance at which an interaction between two proteins can be detected. Our hypothesis was that increased linker length would allow the detection of more distant interactions. We first verified that increasing linker length modified our capacity to detect interactions without loss of specificity. New interactions were detected within and between complexes. We then validated our capacity to better dissect the architecture of protein complexes by studying five protein complexes with different combinations of linker lengths. Finally, we confirmed that the method allowed the detection of interactions between proteins that are farther apart in space by comparing our results with distances calculated from available proteasome structures. This variation on DHFR PCA makes it possible to modulate the resolution at which PPIs are studied and thus to better define the architecture of protein complexes.
Table of contents
Summary
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental role of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
1.3.2 Methods determining the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
Summary
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography
List of tables
Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for the global PCA experiment
Table S1C. PCA data for the intra-complex experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between the proteasome structure and the experimental sequence
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures
List of figures
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove useful for inferring the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins
List of abbreviations
% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Adjusted P-value (false discovery rate)
FRET Fluorescence resonance energy transfer
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 First quartile
Q3 Third quartile
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
μb Estimated mean
μg Microgram
μL Microliter
μM Micromolar
2YT 2x yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of many people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available to his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators Xavier Barbeau and Patrick Lagüe. My thesis is all the better for their excellent work. Special thanks to Xavier for his help, his availability and the stimulating discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building camaraderie with its members. I would like to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of course, the support both in the lab and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at putting a smile on my face and at supporting and advising me in difficult times.

It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me have given me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank, from the bottom of my heart, my partner Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis consists of a single chapter written as a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that allows the detection of associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, including me, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and me.

During this project, I also contributed to writing a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not included in this thesis.
General introduction
1.1 The fundamental role of protein-protein interactions
Proteins, with their great diversity of roles, are considered the machinery of life. Their associations, whether temporary or permanent, are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, Van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins gives rise to additional biological activities that would be impossible for the proteins individually. The proteasome illustrates this concept very well: it is a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a large and much more elaborate interaction network. Mapping PPI networks is essential to understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps are a static picture of the network and do not fully account for the cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To address this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, namely by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs offers a new perspective on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thereby identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs can shed light on the robustness of a protein complex to mutations, that is, the capacity of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which will eventually make it possible to treat the infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Furthermore, different mutations in the same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Nevertheless, the methods as a whole can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins interact physically. The second category also includes three hybrid methods: fluorescence resonance energy transfer (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close together in space without necessarily being in physical interaction. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary: one defines the components of a protein complex, and the other defines the relationships these components maintain with one another.
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all the protein complexes of an organism for study (43).
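The separation of peptides by mass-to-charge ratio described in this section can be illustrated with a short calculation. The Python sketch below is not taken from the thesis: it assumes standard monoisotopic residue masses and positive-mode ionization by protonation, and the function names `peptide_mass` and `mz` are hypothetical. It computes the m/z value at which an unmodified peptide would be expected for a given charge state, using the Gly-Gly-Gly-Gly-Ser motif only as a convenient example sequence.

```python
# Illustrative sketch only (assumption: unmodified peptide, protonated ions).
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier in positive mode


def peptide_mass(sequence: str) -> float:
    """Return the neutral monoisotopic mass of a peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mz(sequence: str, charge: int) -> float:
    """Return the mass-to-charge ratio of the [M + zH]z+ ion (hypothetical helper)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "GGGGS"  # example sequence, chosen for illustration
    for z in (1, 2):
        print(f"{peptide} z={z}: m/z = {mz(peptide, z):.4f}")
```

The same peptide appears at a different m/z for each charge state, which is one reason database matching compares whole profiles rather than single values.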
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest interact physically, the two reporter fragments assemble, reconstituting a functional reporter that allows a signal to be detected.

In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations on the method have nevertheless been introduced to allow the study of membrane proteins (45-47); one such method is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, shows background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest interact physically, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane, insoluble and non-nuclear proteins. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying results (47-50, 52, 53).

PCA is similar to the two methods described above, but rather than using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the split site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while also retaining the natural promoter of the proteins studied (48, 55, 56). Furthermore, results obtained by PCA suggest that the cellular localization of proteins is preserved: there is gene ontology enrichment for many proteins known to share the same cellular localization (55). However, a change in localization cannot be ruled out, given that the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).
One of the major drawbacks of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
However, adding a linker often reduces these risks by moving the reporter fragment away from
the protein to which it is attached, which reduces interference between the two. It may be
necessary to optimize the composition or the length of the linker. There are three categories
of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are
generally used when some mobility between the protein of interest and the reporter fragment is
desirable. Rigid linkers provide better separation between the protein of interest and the
reporter fragment and ensure that the functions of each element are maintained. They are mostly
useful when a flexible linker is insufficient to separate the two elements properly or when it
interferes with the activity of the protein. In vivo cleavable linkers allow the reporter
fragment to be released under certain conditions. They are particularly useful for allowing
each element to carry out its own biological activity. It is therefore essential to choose the
linker and its parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS, and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two
fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation
of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when
the two proteins are close to each other. The interaction is detected by microscopy or
cytometry through the emission of the acceptor fluorescent protein. This method is particularly
useful for following an interaction over time. However, substantial background noise and the
partial overlap of the fluorescence of the two proteins can hamper interpretation of the
results (60-63).
Cross-linking followed by MS is almost identical to purification and MS techniques, except that
before purification the proteins are attached to each other by covalent bonds. These bonds
withstand enzymatic digestion and thus provide structural information on how proteins associate
within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead
to an incorrect picture of the architecture of the protein complex. This method is difficult to
apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is performed by a mutant biotin ligase lacking specificity that is fused to the
protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions in
addition to interactions between neighbouring proteins. However, the biotin ligase is larger
than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology.
This large size can impair the activity of the protein of interest or the formation of
interactions. In addition, this method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They provide information on protein proximity, giving access to a
molecular scale of resolution that is otherwise difficult to reach. In addition to being
complex, existing techniques require dedicated infrastructure (equipment and databases) and are
difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would
help to better define the architecture of protein complexes and their subcomplexes at low
molecular resolution. Such methods would complement the two categories of methods. These new
hybrid methods would compensate for the limitations of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to
many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for
modulating the detection of protein-protein interactions
Because of its relative simplicity and of the linker that connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker
is a short, soluble and flexible peptide segment composed of two repeats of the following
motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association
of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small
amino acids, the former nonpolar and the latter polar and uncharged. The linker connects the
reporter fragment to the C-terminus of the proteins under study.
Linker length also imposes a constraint on the ability to detect an interaction, as was notably
observed by the research team that developed large-scale PCA (55). While studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noticed that an
interaction was 3.5 times more likely to be detected when the C-termini of the proteins of
interest were less than 82 Å apart (55). This distance corresponds to the length of the two
linkers placed end to end. In addition, an earlier study had shown that increasing linker
length made it possible to determine the conformation of a dimeric receptor (69). It is thus
possible to detect new interactions and, in doing so, to obtain new structural information.
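To make the distance argument concrete, the short sketch below extrapolates the 82 Å limit to longer linkers. It rests on two loud assumptions that are not stated in the text: the 82 Å span is shared equally by the 20 residues of two 2xL linkers placed end to end, and longer linkers scale linearly as if fully extended. A flexible linker in a cell rarely stays extended, so these values are rough upper bounds at best. The function name `max_span` is hypothetical.

```python
# Assumption: the 82 Å limit is spread evenly over the 20 residues of two
# 2xL linkers (2 x 2 repeats of GGGGS), and spans scale linearly with length.
RESIDUES_PER_REPEAT = 5            # one GGGGS motif
REFERENCE_SPAN_ANGSTROM = 82.0     # two 2xL linkers end to end (from the text)
REFERENCE_RESIDUES = (2 + 2) * RESIDUES_PER_REPEAT

ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def max_span(repeats_bait: int, repeats_prey: int) -> float:
    """Rough upper bound (in Å) on the detectable C-terminus distance."""
    residues = (repeats_bait + repeats_prey) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


for bait, prey in [(2, 2), (2, 4), (4, 4)]:
    print(f"{bait}xL-{prey}xL: about {max_span(bait, prey):.0f} Å")
```

Under these assumptions, doubling both linkers would double the nominal reach, which is the intuition behind using linker length as a coarse ruler.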
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the
detection of interactions between proteins that are increasingly distant in space, which would
modulate the scale of molecular resolution. This adaptation would then yield a new hybrid
method that could help define protein-protein associations between protein complexes and
subcomplexes. The first objective was to assess the overall impact of different linker lengths
on the ability to detect protein-protein associations. To this end, protein-protein
associations between 15 proteins found in seven protein complexes were tested against the
proteins found in these complexes and their known interactors. The second objective was to
assess the impact of increasing linker length on our understanding of the architecture of
protein complexes and their subcomplexes. Five protein complexes differing in size and
flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric
Golgi (COG) complex. The study was carried out with different combinations of linker lengths.
The last objective was to test whether increasing linker length allowed the detection of
associations between proteins that are further apart in space. To do so, distances were
calculated between the proteins contained in the proteasome structures and compared with the
experimental results.
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive for several reasons, notably the availability of many powerful genetic
tools, its rapid cell division and the abundance of data on the structure of protein complexes
and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various
fields, such as the determination of protein function, regulatory networks, gene expression,
protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with one
another into complexes and determining their spatial arrangements. We examined the potential of
the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast
to obtain low-resolution structural constraints on protein complexes. We showed that using
extended peptide linkers between the fusion proteins and the DHFR fragments improves the
detection of protein-protein interactions and reveals interactions that are more distant in
space. Extended linkers thus provide an improved tool to detect and measure protein-protein
interactions and protein proximity in vivo. We used this tool to further investigate the
architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG)
complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions and
protein proximity in living cells. We use this tool to further investigate the architecture of
the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in
yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
association among proteins that survives cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology, because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was
shown that the distance constraint determined by linker length could affect the ability to
detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter allows the detection of configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here we test the effect of linker size on the ability to detect
PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
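As an illustration of how added repeats can differ at the DNA level while encoding the same peptide, the sketch below builds two GGGGS repeats from different synonymous codons and checks that both translate to the same motif. The codon table entries for glycine and serine are from the standard genetic code; the particular codon choices and the helper names (`encode_repeat`, `translate`, `linker_peptide`) are hypothetical and are not the sequences used in the plasmids.

```python
MOTIF = "GGGGS"

# Standard genetic code entries for glycine and serine.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def linker_peptide(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` GGGGS motifs."""
    return MOTIF * repeats


def encode_repeat(offset: int) -> str:
    """Encode one GGGGS motif, shifting the codon choice by `offset`."""
    dna = []
    for position, aa in enumerate(MOTIF):
        options = CODONS[aa]
        dna.append(options[(position + offset) % len(options)])
    return "".join(dna)


def translate(dna: str) -> str:
    """Translate a DNA string made only of glycine and serine codons."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


repeat_a = encode_repeat(0)
repeat_b = encode_repeat(1)
print(repeat_a, repeat_b, translate(repeat_a + repeat_b))
```

Varying the codons between repeats is a common way to limit repetitive DNA, which may be why synonymous codons were chosen here, although the text does not state the reason.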
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted.
The fragments and plasmids were assembled by Gibson cloning (95) with an
insert/vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm
plasmids, with an insert (linker)/insert (DHFR F[3])/vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10,000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or interact
with one of them, based on data reported in BioGRID in October 2014 (96). A random set of
97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included
in the set of preys as controls. Each prey was present in four replicates, two on each prey
plate, so each interaction was measured four times. Preys were randomly positioned to avoid
location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
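The block design can be sketched as follows: each block holds the four bait-prey linker pairings, and the usable area of a 1536-array shrinks once a double border is reserved for the control strain. The 32 x 48 grid is the usual layout of a 1536-position plate and is an assumption here, as is the idea that every inner position is available for blocks; the variable names are hypothetical.

```python
from itertools import product

LINKERS = ("2xL", "4xL")

# One block of four colonies covers every bait-prey linker pairing.
block = [f"{bait}-{prey}" for bait, prey in product(LINKERS, repeat=2)]

ROWS, COLS = 32, 48   # assumed layout of a 1536-position array
BORDER = 2            # double border of control colonies on every side

inner_positions = (ROWS - 2 * BORDER) * (COLS - 2 * BORDER)
blocks_per_plate = inner_positions // len(block)

print(block)
print(inner_positions, "inner positions,", blocks_per_plate, "blocks at most")
```

Keeping the four pairings of one interaction adjacent means they share the same local growth environment, which is what later allows their colony sizes to be compared directly.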
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey
plates were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Each 1536-bait plate was then crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two
rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and
circularity below 0.5, using the particle closest to the center. We considered the
particle as being a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
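The particle filter described above can be sketched in a few lines. Circularity is computed as 4π × area / perimeter², the definition used by ImageJ. The sketch assumes that a particle is rejected if it fails either the area or the circularity criterion, which is one reading of the text, and the `Particle` class and function names are hypothetical stand-ins for the output of the particle detection step.

```python
import math
from dataclasses import dataclass

MIN_AREA = 20          # pixels
MIN_CIRCULARITY = 0.5


@dataclass
class Particle:
    area: float          # in pixels
    perimeter: float     # in pixels
    touches_edge: bool   # True if the particle touches the selection edge


def circularity(p: Particle) -> float:
    """4*pi*area / perimeter^2; equals 1.0 for a perfect circle."""
    return 4 * math.pi * p.area / p.perimeter ** 2


def is_colony_candidate(p: Particle) -> bool:
    # Assumption: failing any single criterion is enough to reject a particle.
    return (
        not p.touches_edge
        and p.area >= MIN_AREA
        and circularity(p) >= MIN_CIRCULARITY
    )


round_colony = Particle(area=math.pi * 10 ** 2, perimeter=2 * math.pi * 10,
                        touches_edge=False)
streak = Particle(area=100, perimeter=100, touches_edge=False)
print(is_colony_candidate(round_colony), is_colony_candidate(streak))
```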
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid taking the logarithm of zero. All colonies
with a size smaller than 16 on the diploid selection plate were eliminated.
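A minimal sketch of this transformation and filter is given below. The function name `colony_score` and the use of `None` to mark eliminated colonies are illustrative choices, not part of the original analysis scripts.

```python
import math

MIN_DIPLOID_SIZE = 16


def colony_score(mtx_intensity: float, diploid_size: float):
    """Return log2(intensity + 1), or None if the diploid colony was too small."""
    if diploid_size < MIN_DIPLOID_SIZE:
        return None
    return math.log2(mtx_intensity + 1)


print(colony_score(0, 100))     # a colony with no growth maps to 0.0
print(colony_score(1023, 100))  # log2(1024)
print(colony_score(500, 10))    # eliminated: diploid colony smaller than 16
```

Filtering on the diploid plate removes positions where mating or pinning failed, so that an absent colony on MTX is not mistaken for an absent interaction.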
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
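The scoring rules above can be summarized in a short sketch. It takes the background mean b and standard deviation sdb as given, since in the analysis they come from the mixture model fitted in R; the numeric values below are hypothetical, as are the function names.

```python
from statistics import median

Z_DETECTED = 2.5     # detection threshold on the z-score
Z_CHANGED = 2.0      # minimum z-score difference counted as a change
MIN_REPLICATES = 2


def interaction_score(colony_sizes):
    """Median colony size (Is), or None with fewer than two replicates."""
    if len(colony_sizes) < MIN_REPLICATES:
        return None
    return median(colony_sizes)


def z_score(score, b, sdb):
    """Zs = (Is - b) / sdb, relative to the background distribution."""
    return (score - b) / sdb


def is_detected(zs):
    return zs > Z_DETECTED


def has_changed(zs_reference, zs_longer):
    return abs(zs_longer - zs_reference) > Z_CHANGED


b, sdb = 8.0, 0.5  # hypothetical background parameters
zs_2x = z_score(interaction_score([9.0, 9.5, 10.0, 9.5]), b, sdb)
zs_4x = z_score(interaction_score([10.5, 11.0, 11.0, 11.5]), b, sdb)
print(zs_2x, zs_4x, is_detected(zs_2x), has_changed(zs_2x, zs_4x))
```

Expressing each score relative to the background of its own linker combination is what makes z-scores from different combinations comparable.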
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 − 3 × (Q3 − Q1) or above Q3 + 3 × (Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
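The extreme-outlier rule used at the start of this filtering can be sketched as below. The quartile convention is an assumption (Python's "inclusive" method, which matches the default of R's quantile function), and the function names are hypothetical.

```python
from statistics import quantiles


def extreme_outlier_bounds(values, k=3.0):
    """Return (Q1 - k*IQR, Q3 + k*IQR); k = 3 gives the extreme-outlier fences."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the extreme-outlier fences."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]
```

With colony sizes 10, 11, 12, 13, 14 and 100, the fences are 3.75 and 21.25, so only the value 100 is removed.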
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned such that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations
could be compared directly. The difference from the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than −1, with t-test
FDR p-value < 0.05) (Table S1C).
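The classification rules above can be sketched as follows. The text does not state which FDR procedure was applied, so the Benjamini-Hochberg adjustment shown here is an assumption; the paired t-test itself is not reproduced, and raw p-values are taken as given. Function names and category labels are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, longer_results,
                  alpha=0.05, min_diff=1.0):
    """Classify one protein pair.

    longer_results: list of (difference vs 2xL-2xL, FDR-adjusted p-value),
    one entry per longer linker combination.
    Simplification: a pair detected with 2xL-2xL is never labelled as lost here.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(p < alpha and abs(d) > min_diff for d, p in longer_results)
    return "quantitative change" if changed else "unchanged"
```

Note that both conditions must hold for the same linker combination: a large difference with a non-significant p-value, or a significant but small difference, leaves the pair in the unchanged category.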
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core; the inner rings of the core were taken from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were
used as nodes to build the graph. Edges were placed between each pair of nodes
within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
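The graph-based distance described above can be illustrated with a minimal sketch. The study used the NetworkX implementation of Dijkstra's algorithm; the version below is a dependency-free stand-in written with the standard library, with hypothetical function names and toy coordinates in place of real Cα positions.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Build a weighted graph from {residue_id: (x, y, z)} surface C-alpha positions.

    An edge joins two residues when they lie within `cutoff` (in angstroms);
    its weight is their Euclidean distance.
    """
    ids = list(coords)
    graph = {residue: [] for residue in ids}
    for pos, a in enumerate(ids):
        for b in ids[pos + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf
```

Because only surface residues are nodes and edges are limited by the cutoff, the resulting path follows the surface of the complex instead of cutting straight through it, so the path length is at least as long as the straight-line distance between the two C-termini.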
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric
effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as only one
PPI) after filtering.
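The distinction between all pairs and unique pairs can be made concrete with a small sketch, in which a bait-prey pair and its reciprocal prey-bait pair collapse into a single unordered pair. The function name and the example pairs are hypothetical and are not taken from Table S1C.

```python
def unique_pairs(bait_prey_pairs):
    """Collapse reciprocal (bait, prey) pairs into a set of unordered pairs."""
    return {frozenset(pair) for pair in bait_prey_pairs}
```

For example, the three oriented pairs (X, Y), (Y, X) and (X, Z) correspond to two unique pairs.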
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
conclusion that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
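The correlation between pairwise distance and z-score is reported as a Spearman coefficient in Figure 2B. As an illustration of what this statistic measures, here is a dependency-free sketch (hypothetical function names, toy data) that ranks both variables, averaging tied ranks, and then computes the Pearson correlation of the ranks.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A negative coefficient, as reported for every linker combination, means that z-scores tend to decrease as the distance between C-termini increases, without assuming that the relationship is linear.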
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = −0.34, p-value = 2.249e-15; 2xL-4xL: r = −0.36,
p-value < 2.2e-16; 4xL-2xL: r = −0.36, p-value < 2.2e-16; 4xL-4xL: r = −0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative
z-scores for the proteasome PPIs according to the different protein pairwise distances. (C)
Distribution of three categories of detected PPIs for the RNApol and proteasome complexes
according to the distance between the C-termini, for interactions that are not affected by longer
linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests
are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
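As a rough illustration of the thresholding described in panel (B), the sketch below converts colony sizes to z-scores and keeps the protein pairs above a cutoff. It is a minimal sketch and not the analysis pipeline used in this work: the function name `positive_ppis`, the choice of the mean and standard deviation of all colony sizes as the reference distribution, and the toy values in the example are assumptions made for illustration only.

```python
import statistics


def positive_ppis(colony_sizes, z_threshold):
    """Return {pair: z_score} for pairs whose z-score exceeds z_threshold.

    colony_sizes maps a (bait, prey) tuple to a colony size.
    Assumption: z-scores are computed against the mean and standard
    deviation of every colony size passed in.
    """
    values = list(colony_sizes.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # No spread in the data, so no pair can stand out.
        return {}
    z_scores = {pair: (size - mean) / sd for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_threshold}


if __name__ == "__main__":
    # Made-up colony sizes: twenty background pairs and one strong signal.
    sizes = {(f"bait{i}", "prey"): 100.0 for i in range(20)}
    sizes[("baitX", "preyY")] = 1000.0
    print(positive_ppis(sizes, z_threshold=2.5))
```

Passing the cutoff as an argument keeps the choice of threshold explicit, so the same function can be reused if a different reference distribution or cutoff is preferred.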
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
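The distance-weighted shortest path in panel (C) can be pictured as a small graph search. The sketch below is only an illustration under stated assumptions: points stand in for surface residue coordinates, two points are connected when they lie within a hypothetical `cutoff` distance, each edge is weighted by its Euclidean length, and Dijkstra's algorithm finds the shortest route between two end points. The function name, the cutoff rule and the toy coordinates are not taken from this work.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Dijkstra search over 3D points.

    coords maps a residue label to an (x, y, z) tuple.
    Assumption: two residues are neighbours if they are at most
    `cutoff` apart; the edge weight is their Euclidean distance.
    Returns (total_distance, path), or (math.inf, []) if unreachable.
    """
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            break
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                previous[other] = node
                heapq.heappush(heap, (dist + step, other))
    if end not in best:
        return math.inf, []
    # Walk back from the end point to rebuild the path.
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[end], path[::-1]


if __name__ == "__main__":
    # Toy coordinates: the direct A-C hop exceeds the cutoff, so the
    # path has to go through B.
    toy = {"A": (0, 0, 0), "B": (3, 0, 0), "C": (3, 4, 0)}
    print(shortest_surface_path(toy, "A", "C", cutoff=4.5))
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex rather than cut through it, which is why such a distance can exceed the straight-line distance between two termini.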
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in space,
whether or not they interact physically. Such a method would make it possible to examine and
dissect the architecture of protein complexes in greater depth. In practice, the approach was
to modify the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method,
we first had to verify whether increasing linker length changed the interactions detected. It
was also relevant to test the method on protein complexes using several combinations of
linkers of different lengths. Finally, validation could be completed by comparing the results
with distances measured from the available protein structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving the
specificity of the method. Indeed, more than 30% of all unique interactions detected were new.
A nearly equivalent proportion of new interactions could therefore be expected if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of linker-length combinations used. For example, it
could be lower if only one combination were used, because some protein-protein associations
were detectable only with a specific combination of linkers. Using an extended linker on the
DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of
those whose signal increases.
The rare cases in which the signal decreased with increasing linker length are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers provides information on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the associations detected accurately reflect the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does allow the detection of associations
between proteins that are farther apart in space.
The modification made to DHFR PCA represents a clear advance in the study of protein-protein
associations. Doubling the length of the linker on the DHFR F[1,2] fragment alone is enough to
increase the capacity to detect distant protein-protein associations. In future experiments, it
would be appropriate to use the standard linker alongside the longer linkers. This would
provide both a validation and a point of comparison, and would help detect problems that
occurred during construction of the proteins, such as incorrect recombination or the
appearance of mutations: interactions would be observed for the correctly constructed protein,
whereas the problematic one would show none. Adding this control does, however, make the
experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA
provides an additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet can also be applied at large scale with a robotic platform. Moreover, DHFR
PCA is an in vivo method that keeps the endogenous promoter for protein expression. The
fragments do not tend to associate spontaneously unless they are very close together, which
reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be followed in real time, which gives access to the study of
interaction dynamics (56). These features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that gives access to several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, to maximize the new information gained while taking into account the additional
complexity introduced by each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such
additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more
distant the associations detected, which lowers the molecular resolution. Before investigating
a protein complex more exhaustively, its characteristics, such as size and flexibility, should
be taken into account. For small protein complexes, a finer resolution, and therefore shorter
linkers, may be sufficient, whereas a lower resolution would be appropriate for large protein
complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully known
but which are visible by electron microscopy or other imaging methods. Their size greatly
limits their study and makes determining their architecture a challenge. Processing bodies and
stress granules are one example. They are involved, respectively, in the degradation and the
storage of messenger RNA during cellular stress, and they are linked to various diseases such
as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded
by linker extension would give us a general picture of their architecture. At the level of an
organism's proteome, this method would provide a better definition of the organization of the
cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Mesurer les associations proteacuteiques agrave proximiteacute in vivo en utilisant la compleacutementation de fragments proteacuteiques
Meacutemoire
Andreacutee-Egraveve Chreacutetien
Sous la direction de
Christian Landry directeur de recherche
III
Reacutesumeacute
Les interactions proteacuteine-proteacuteine (PPI) sont agrave la base du fonctionnement cellulaire de tous
les organismes Regroupeacutees en deux cateacutegories les meacutethodes pour eacutetudier les PPI permettent
soit drsquoidentifier les proteacuteines composant le complexe soit de deacuteterminer les relations entre
les proteacuteines Il existe peu de meacutethodes hybrides permettant drsquoobtenir ces deux informations
et ces meacutethodes comportent plusieurs limitations Le but de ce projet eacutetait de deacutevelopper une
nouvelle meacutethode hybride en modifiant la compleacutementation de fragments proteacuteiques (DHFR
PCA) chez la levure Saccharomyces cerevisiae Le principe de la DHFR PCA repose sur
lrsquoassociation de deux fragments rapporteurs compleacutementaires en preacutesence drsquoune interaction
proteacuteine-proteacuteine Les fragments rapporteurs sont fusionneacutes aux proteacuteines via un connecteur
peptidique La longueur du connecteur limite la distance maximale agrave laquelle il est possible
de deacutetecter une interaction entre deux proteacuteines Notre hypothegravese eacutetait qursquoen augmentant la
longueur du connecteur nous serions en mesure de deacutetecter des interactions plus eacuteloigneacutees
Nous avons drsquoabord veacuterifieacute que lrsquoaugmentation de la longueur du connecteur permettait de
modifier notre capaciteacute agrave deacutetecter des interactions sans toutefois perdre la speacutecificiteacute de la
meacutethode De nouvelles interactions ont eacuteteacute deacutetecteacutees agrave lrsquointeacuterieur drsquoun mecircme complexe
proteacuteique et entre deux complexes Nous avons ensuite valideacute notre capaciteacute agrave mieux
disseacutequer lrsquoarchitecture des complexes proteacuteiques en approfondissant le cas de cinq
complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de longueurs de connecteurs Enfin
nous avons confirmeacute que la meacutethode permettait effectivement de deacutetecter des interactions
entre proteacuteines plus distantes en comparant les reacutesultats obtenus aux distances calculeacutees agrave
partir des structures du proteacuteasome disponibles La variation apporteacutee agrave la DHFR PCA
permet de moduler la reacutesolution de lrsquoeacutetude des PPI et ainsi de mieux deacutefinir lrsquoarchitecture
des complexes proteacuteiques
IV
Abstract
Protein-protein interactions (PPIs) are central to all cellular processes in all organisms.
Methods to study PPIs fall into two categories: they either identify the proteins composing
protein complexes or determine the relationships between proteins. Only a few hybrid methods
can provide both types of information, and these methods have many limitations. The goal of
this project was to develop a new hybrid method by modifying the protein-fragment
complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on
the association of two complementary reporter fragments in the presence of an interaction.
Both fragments are fused to proteins through a peptide linker. Linker length limits the
maximal distance at which an interaction between two proteins can be detected. Our hypothesis
was that increasing linker length would allow the detection of more distant interactions. We
first verified that increasing linker length modified our capacity to detect interactions
without losing specificity. New interactions were detected within and between complexes. We
then validated our capacity to better dissect the architecture of protein complexes by
studying five protein complexes with different combinations of linker lengths. Finally, we
confirmed that the method allowed the detection of interactions between proteins that are
farther apart in space by comparing our results with distances calculated from available
proteasome structures. This variation of DHFR PCA makes it possible to modulate the resolution
of PPI studies and thus to better define the architecture of protein complexes.
Table of contents
Summary
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental nature of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes
followed by mass spectrometry
1.3.2 Methods determining the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation
assay (PCA)
Summary
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography
List of tables
Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for the global PCA experiment
Table S1C. PCA data for the intra-complex experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between the proteasome structure and the experimental sequence
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
List of figures
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment
complementation (PCA) screen and prove to be useful to infer the super-organization of
protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
List of abbreviations
% Percentage
°C Degree Celsius
Å Ångström
Amp Ampicillin
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
DNA Deoxyribonucleic acid
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P-value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
mRNA Messenger ribonucleic acid
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
μb Estimated mean
μg Microgram
μL Microliter
μM Micromolar
2YT 2× yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of several people whom I sincerely wish to thank.
First of all, I must thank Dr. Christian Landry, my master's supervisor. Throughout this
journey, Christian encouraged me to give the best of myself, both scientifically and as a
member of the team. He not only gave me the material means to do so, but also showed me that
I had the abilities to do so. Christian is a very present supervisor who is available to his
students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr.
Nicolas Bisson, for their advice and the time they devoted to me during this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research
professionals of the laboratory. Their great expertise and their passion for science are a
pillar of this team. Without their valuable advice, their dedication and their availability,
carrying out this project would have been particularly arduous. I also wish to thank my
collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is
all the better. A special thank you to Xavier for his help, his availability and the lively
discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies
require spending a lot of time in the laboratory, which becomes like a second home. Hence the
importance of sharing laughs and building a rapport with its members. I would like to thank
them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of
course, the support, both in the laboratory and moral. Thank you to Claudine for the summer we
shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her
collaboration and her smile, and to Marie for her advice on analysis. A very special thank you
to Guillaume and Hélène, who were especially good at putting a smile on my face, or at
supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends. My
parents have always encouraged me to fulfil myself and to love my work. They not only provided
me with an ideal setting in which to reach my goals throughout my studies, but also offered me
their moral support and instilled in me the importance of always doing one's best. The values
they passed on to me have given me a strong sense of responsibility, honesty and commitment.
Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out
from time to time. They were a source of moral support.
Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an
incredibly generous person: generous with his time, his attention, his knowledge and his
passions. He was an invaluable support throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows.
He was also there to celebrate the successes. I have no words to describe how much this person
has given me personally, humanly and professionally. Marc has made me a better person and I
will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will
be submitted for publication. This article presents the adaptation of the PCA method to detect
associations between proteins that are distant in space, and its application to the study of
protein complexes. I contributed to the planning of the experiments with Christian R. Landry
(project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research
professionals). Several people, including myself, took part in carrying out these experiments:
Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and
Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier
Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the
writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and
myself.
During this project, I also contributed to the writing of a literature review published in
Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of
protein interactomes reveals their mechanisms of regulation, robustness and insights into
genotype-phenotype maps". Several people took part in the writing: Marie Filteau
(postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral
student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and
Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, through their great diversity of roles, are considered the machinery of life. Their
associations, whether temporary or permanent, are at the heart of signaling and regulatory
pathways as well as protein complexes. Proteins can interact with one another through
intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces
and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper
functioning of the cell, since they are involved in all cellular processes as well as in the
maintenance of cellular functions.
Interactions that form transiently are often found in signaling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination can
lead to diseases such as cancer (1). One example of a transient association is that of the two
catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity
of this enzyme is regulated by the association and dissociation of the catalytic and
regulatory subunits. In yeast and mammals, the transition from one form to the other controls
several processes, including energy metabolism, cell growth, aging and the response to stimuli
(3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's
syndrome (8).
In addition to transient interactions, the cell is home to stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs of a complex are
stable, the complex may form only in a particular context. A protein complex can be defined as
an association between two or more proteins (9). The association between these proteins allows
additional biological activities to emerge that would be impossible for the individual
proteins. An example that illustrates this concept very well is the proteasome, a protein
complex involved in protein homeostasis through the degradation of obsolete proteins tagged
with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped
catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins,
some of which are present in more than one copy (10-13). Given its importance in protein
recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative
diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network has been
mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae
(18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26)
and several viruses (27-29). These maps are a static picture of the network and do not fully
take into account the cell's capacity to adapt to different conditions (e.g., environment,
cell cycle). To overcome this limitation, additional maps were subsequently produced that
consider the dynamics of interaction networks, that is, by perturbing cell growth conditions.
They provide information on, among other things, the adaptation or plasticity of an organism
in the presence of a stress or a new environment. Despite this new perspective, it remains
difficult to distinguish a stable interaction from a transient one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs offers a new perspective on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as demonstrated by
the study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which
diverged more than 1.5 billion years ago, show similarities and differences in the structure
of their nuclear pore. This essential protein complex forms a channel in the membrane of the
cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm.
Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that
subsequently diverged. The differences in structure explain the distinct mRNA export
mechanisms in the two organisms (30). Furthermore, perturbing PPIs makes it possible to
elucidate the robustness of a protein complex to mutations, that is, the capacity of the
complex to function despite the perturbation. Diss and colleagues systematically deleted the
genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a
non-essential protein complex whose function is to recycle membrane receptors. By analyzing
the interactions present in these complexes after each perturbation, the authors observed that
the nuclear pore remained functional despite the loss of certain proteins, whereas the
retromer dissociated completely after the loss of a single protein. They thereby identified
the proteins essential for the assembly of these complexes and demonstrated the importance of
paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In
addition, identifying structural differences in a protein complex between two organisms can
provide useful targets for selectively inhibiting the complex of one organism. Very recently,
a research group developed an inhibitor that targets the proteasome of Leishmania donovani,
Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which will eventually make it
possible to treat the infections caused by these parasites (35). PPIs also help in
understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team
examined nearly 3,000 mutations found across a spectrum of Mendelian diseases. In nearly 60%
of cases, perturbation of interaction networks was responsible for the diseases studied,
affecting the networks either partially or completely. Moreover, different mutations in the
same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary, since each has advantages and limitations that
allow it to target only a different subset of the interaction network (37). Nevertheless, all
of the methods can be divided into two main categories: methods that determine the composition
of protein complexes and methods that determine the physical interactions between two
proteins.
The first category includes methods that purify a protein complex, either by affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second category
comprises a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast
two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the
second category work on a very similar principle, based on the reconstitution of a functional
reporter that emits a signal when the two proteins physically interact. The second category
also includes three hybrid methods: energy transfer between fluorescent molecules (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context,
the term "hybrid method" refers to methods that detect associations between proteins that are
close together in space without these necessarily being physical interactions. These methods
therefore have characteristics of both categories. For the purposes of this project, they are
considered part of the second category because they provide information on the spatial
relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships among those
components.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
The purification of protein complexes followed by identification of their components by MS is
a method that aims to isolate a protein complex and identify its members. Several techniques
are used to purify protein complexes, including affinity chromatography. Affinity
chromatography separates a protein of interest and its interactors from a protein extract
using an epitope specific to that protein. This epitope is recognized by an antibody bound to
the purification column. Several rounds of purification can be performed to reduce the
non-specific interactions that cause background noise. The isolated proteins are then digested
into peptides. The mass spectrometer ionizes these peptides and separates them according to
their mass-to-charge ratio, resulting in a mass spectrum. Comparing the profiles obtained with
those in a database makes it possible to identify the proteins found in the complex (38-40).
Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected
and fragmented, and a new round of spectrometry is performed on the resulting fragments. This
additional spectrum provides more information about the peptide (41, 42). Other purification
techniques exist, such as size-exclusion chromatography, in which separation is based on the
size of the protein complexes. The main value of this purification is that it allows all of
the protein complexes of an organism to be isolated for study (43).
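
As a rough illustration of the mass-to-charge ratio on which the separation described above
relies, the following sketch computes the m/z of a protonated peptide in positive-ion mode.
The peptide mass of 1500.0 Da and the function name are hypothetical and chosen only for
illustration; the only physical constant used is the mass of the proton.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide carrying `charge` extra protons (positive-ion mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    # Hypothetical peptide of 1500.0 Da observed at charge states 1+ to 3+
    for z in (1, 2, 3):
        print(f"z = {z}+  m/z = {mass_to_charge(1500.0, z):.4f}")
```

The same peptide therefore appears at a different position in the spectrum for each charge
state, which is one reason the observed profiles are compared against a database rather than
read directly as masses.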
1.3.2 Methods determining the protein interaction network
1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid and the protein-fragment
complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest via a linker. When the two proteins of interest
physically interact, the two reporter fragments assemble, reconstituting a functional reporter
that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once
reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium.
Initially, the transcription factor was Gal4p and the selection medium contained galactose
(44). Y2H was a pioneering method that enabled the development of several other methods.
However, this technique has some limitations. First, in classical Y2H, the proteins studied
must be soluble. Variations of the method have nevertheless been developed to allow the study
of membrane proteins (45-47). This method is the subject of the next paragraph. Second, since
the reporter is a transcription factor, the interactions tested must be located in the
nucleus, which may alter the endogenous localization of the proteins. The technique also has
low sensitivity, produces background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is therefore impossible
to establish links between the abundance of a protein and the strength or abundance of an
interaction between proteins (48-50). Despite these constraints, it is still widely used
because it allows the PPIs of another species, such as humans, to be studied in a simpler
model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. In the presence of a physical interaction between the proteins of
interest, the transcription factor attached to the reconstituted ubiquitin is released,
thereby activating the transcription of a reporter gene. Methods based on split-ubiquitin have
enabled major advances in the study of membrane proteins, insoluble proteins and proteins
located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial
background noise and the impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The choice
of the reporter and of the cleavage site were decisive elements in the design of the method.
Moreover, since the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to interact spontaneously unless they are very
close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a
mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to
methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in
particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the
observed signal is cell density, that is, the number of cells that managed to grow on the
selection medium. This technique has the advantage of being quantitative and of retaining the
native promoter of the proteins studied (48, 55, 56). In addition, the results obtained by PCA
suggest that the cellular localization of proteins is preserved. Indeed, there is a gene
ontology enrichment for several proteins known to share the same cellular localization (55).
However, a change in localization cannot be ruled out, given that the reporter fragments are
added at the C-terminus, which could interfere with the localization signal sequence of the
proteins (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. However, adding a
linker often reduces these risks by moving the reporter fragment away from the protein to
which it is attached, which reduces interference between the two proteins. Its composition or
length may need to be optimized. There are three categories of linkers: flexible linkers,
rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some
mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers
provide better separation between the protein of interest and the reporter fragment and ensure
that the functions of each element are maintained. They are mainly useful when a flexible
linker is insufficient to properly separate the two elements or when it interferes with the
activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released
under certain conditions. They are particularly useful for allowing each element to carry out
its own biological activity. It is therefore essential to choose the linker and its parameters
carefully to obtain the expected results (58, 59).
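
To make the linker-length parameter concrete, the sketch below builds the flexible linkers
named in the list of abbreviations (2xL, 3xL and 4xL, that is, 2, 3 and 4 repeats of the
Gly-Gly-Gly-Gly-Ser motif) and gives a crude upper bound on how far each could reach. The
value of 3.8 Å per residue is an assumption (roughly the spacing between consecutive alpha
carbons in an extended chain), and the function names are hypothetical. A flexible linker in
the cell is expected to be much more compact, so these numbers are upper bounds for
illustration only, not distances measured in this work.

```python
MOTIF = "GGGGS"              # one Gly-Gly-Gly-Gly-Ser repeat (one-letter codes)
ANGSTROM_PER_RESIDUE = 3.8   # assumed per-residue length of a fully extended chain


def linker_sequence(repeats: int) -> str:
    """Return the amino-acid sequence of a linker made of `repeats` motifs."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return MOTIF * repeats


def max_span_angstrom(repeats: int) -> float:
    """Crude upper bound on linker reach: residue count times assumed residue length."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for n in (2, 3, 4):
        seq = linker_sequence(n)
        print(f"{n}xL  {seq}  {len(seq)} residues  <= {max_span_angstrom(n):.0f} Å")
```

Because each interacting protein carries its own linker, the reach that matters for a given
pair is at most the sum of the two bounds, which is why combinations of linker lengths are
worth considering.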
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on the transfer of energy between two fluorescent proteins in close proximity to
each other. The two fluorescent proteins are fused to the two proteins whose proximity is to
be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close to each other. The interaction is detected
by microscopy or cytometry through the emission of the acceptor fluorescent protein. This
method is particularly useful for following an interaction over time. However, substantial
background noise and the partial overlap of the fluorescence of the two proteins can hinder
interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links resist enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect model of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, this method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
Because of its relative simplicity and because a linker joins the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate properly in the cellular environment. Indeed, glycine and serine are two small amino acids, one neutral and the other polar, respectively. The linker connects the reporter fragment to the C-terminus of the proteins under study.
The length of the linker also places a constraint on the ability to detect an interaction, as observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing the length of the linker made it possible to determine the conformation of a dimeric receptor (69). It is therefore possible to detect new interactions and, in doing so, to obtain new structural information.
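As a rough illustration of this distance constraint, the sketch below extrapolates the 82 Å span reported for two 2xL linkers to longer linker combinations. It assumes, purely for illustration, that the maximal span grows linearly with the number of linker residues (82 Å for 20 residues) and that the linkers are fully extended; the function name `max_span` and the linearity assumption are not taken from the source.

```python
# Rough upper-bound sketch (illustrative assumption: span scales linearly
# with the number of linker residues; real linkers are flexible coils).
MOTIF = "GGGGS"                           # repeated motif of the linker
REFERENCE_SPAN = 82.0                     # Å, two 2xL linkers end to end
REFERENCE_RESIDUES = 2 * 2 * len(MOTIF)   # two linkers x two repeats x 5 residues


def max_span(bait_repeats: int, prey_repeats: int) -> float:
    """Return the assumed maximal span (Å) for a bait/prey linker pair."""
    residues = (bait_repeats + prey_repeats) * len(MOTIF)
    return REFERENCE_SPAN * residues / REFERENCE_RESIDUES


for bait, prey in [(2, 2), (2, 4), (4, 4)]:
    print(f"{bait}xL-{prey}xL: about {max_span(bait, prey):.0f} Å")
```

Under these assumptions the values are upper bounds only; they say nothing about how often an extended conformation is actually sampled in the cell.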
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would make it possible to detect interactions between proteins that are increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To reach this objective, associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length made it possible to detect associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using longer peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions between proteins that are more distant in space. Longer linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for detecting and measuring protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable association among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
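The idea of adding motif repetitions with synonymous codons can be sketched as follows. The codon tables are the standard genetic code entries for glycine and serine; the particular codon choices, the `offset` rotation scheme, and the function names are hypothetical and do not reproduce the sequences actually synthesized.

```python
# Standard genetic code entries for the two residues of the linker motif.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def encode(peptide: str, offset: int = 0) -> str:
    """Encode a Gly/Ser peptide, rotating through synonymous codons.

    The rotation is an arbitrary illustrative scheme, not the authors' design.
    """
    dna = []
    for i, aa in enumerate(peptide):
        choices = CODONS[aa]
        dna.append(choices[(i + offset) % len(choices)])
    return "".join(dna)


def translate(dna: str) -> str:
    """Translate a Gly/Ser coding sequence back to its peptide."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


motif = "GGGGS"
first_repeat = encode(motif, offset=0)   # hypothetical existing repetition
extra_repeat = encode(motif, offset=1)   # hypothetical added repetition

# Different DNA sequences, identical peptide.
print(first_repeat, extra_repeat, translate(first_repeat), translate(extra_repeat))
```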
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli, and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 units of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin, and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g), and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000), or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000), or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed, and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated proteins that are members of the following complexes: RNApol I, II, and III, the proteasome, and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17), and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]/Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
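The block layout described above can be sketched as a simple enumeration. This is a minimal illustration: the function name is hypothetical, the protein pair is only an example (Rpn5 and Scl1 are mentioned elsewhere in this chapter), and the physical positions of the four strains within a block are not modelled.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str):
    """Return the four bait/prey strain labels of one block.

    Baits carry DHFR F[3] and preys carry DHFR F[1,2], as in the
    intra-complexes experiment; the label format is illustrative.
    """
    return [
        (f"{bait}-{bait_linker}-F[3]", f"{prey}-{prey_linker}-F[1,2]")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


for bait_strain, prey_strain in linker_block("Rpn5", "Scl1"):
    print(bait_strain, "x", prey_strain)
```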
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal was increased, null, or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed, and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area smaller than 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle as being a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
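The particle filtering step can be sketched in Python as below. This is not the authors' ImageJ script: the `Particle` record and `pick_colony` function are hypothetical, circularity is computed as 4π × area / perimeter² (the definition used by ImageJ), and the sketch assumes that failing either threshold excludes a particle, which is one reading of the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    area: float        # pixels
    perimeter: float   # pixels
    cx: float          # centroid x within the selection
    cy: float          # centroid y within the selection


def circularity(p: Particle) -> float:
    """4*pi*area / perimeter^2; equals 1.0 for a perfect circle."""
    return 4 * math.pi * p.area / p.perimeter ** 2


def pick_colony(particles, center, min_area=20, min_circularity=0.5):
    """Keep particles passing both thresholds, return the one nearest the center."""
    kept = [
        p for p in particles
        if p.area >= min_area and circularity(p) >= min_circularity
    ]
    if not kept:
        return None
    return min(kept, key=lambda p: math.dist((p.cx, p.cy), center))


candidates = [
    Particle(area=10, perimeter=12, cx=0, cy=0),    # too small
    Particle(area=100, perimeter=80, cx=0, cy=0),   # too irregular
    Particle(area=100, perimeter=36, cx=1, cy=1),   # round, near the center
    Particle(area=100, perimeter=36, cx=5, cy=5),   # round, farther away
]
print(pick_colony(candidates, center=(0, 0)))
```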
Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
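A minimal sketch of this transformation, assuming one intensity value on MTX and one colony size on the diploid selection plate per colony (the function and parameter names are hypothetical):

```python
import math


def transform_colony(mtx_intensity: float, diploid_size: float,
                     min_diploid_size: float = 16):
    """Return log2(intensity + 1), or None if the diploid colony was too small."""
    if diploid_size < min_diploid_size:
        return None            # colony eliminated from the analysis
    return math.log2(mtx_intensity + 1)


print(transform_colony(0, diploid_size=50))     # an empty position stays finite
print(transform_colony(1023, diploid_size=50))
print(transform_colony(1023, diploid_size=10))  # filtered out
```

Adding 1 before the transformation keeps positions with zero intensity in the data set instead of producing an undefined logarithm.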
For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significant detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
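The scoring scheme can be sketched in pure Python as below. This is a stand-in for illustration only: the analysis itself used the R function normalmixEM, whereas the small expectation-maximization routine here, its deterministic initialization, the function names, and the toy scores are all assumptions.

```python
import math
import statistics


def fit_background(scores, n_iter=200):
    """Fit a two-component normal mixture by EM; return (mean, sd) of the
    lower-mean component, taken here as the background distribution."""
    xs = sorted(scores)
    half = len(xs) // 2
    # Deterministic start: lower and upper halves of the sorted scores.
    mu = [statistics.mean(xs[:half]), statistics.mean(xs[half:])]
    sd = [statistics.pstdev(xs) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            dens = [
                w[k] * math.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2)
                / (sd[k] * math.sqrt(2 * math.pi))
                for k in range(2)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    bg = 0 if mu[0] <= mu[1] else 1
    return mu[bg], sd[bg]


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb."""
    return (interaction_score - b) / sdb


# Toy interaction scores: a background cluster and a smaller signal cluster.
toy_scores = [1.5, 1.8, 2.0, 2.2, 2.5] * 8 + [9.5, 10.0, 10.5] * 4
b, sdb = fit_background(toy_scores)
detected = [s for s in set(toy_scores) if z_score(s, b, sdb) > 2.5]
print(round(b, 2), round(sdb, 2), sorted(detected))
```

With real data the two components overlap far more than in this toy example, so the fit should be checked against the score histogram, as done in Fig. S1B.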
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
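The extreme-outlier rule can be sketched as follows. The quartile convention is an assumption (the source does not state which one was used); here `statistics.quantiles` with the inclusive method is chosen, and the function name is hypothetical.

```python
import statistics


def drop_extreme_outliers(sizes, k=3.0):
    """Remove values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in sizes if low <= s <= high]


replicates = [10, 11, 12, 13, 14, 100]   # toy colony sizes for one interaction
print(drop_extreme_outliers(replicates))
```

Using three times the interquartile range, rather than the more common 1.5, removes only the most aberrant colonies.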
At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL, and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
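The three categories can be summarized by a small decision function. This is a sketch under stated assumptions: the function name, argument layout, and the "not detected" label are hypothetical; the t-tests and FDR correction are assumed to have been done beforehand; and pairs whose change is significant but smaller than 1 in absolute value are treated as unchanged, a case the text does not spell out.

```python
def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref    -- True if detected with 2xL-2xL
    detected_longer -- dict: longer linker combination -> detected (bool)
    diffs           -- dict: longer combination -> Is difference vs 2xL-2xL
    fdr_pvalues     -- dict: longer combination -> FDR-corrected p-value
    """
    if not detected_ref:
        return "new" if any(detected_longer.values()) else "not detected"
    changed = any(
        abs(diffs[c]) > min_diff and fdr_pvalues[c] < alpha for c in diffs
    )
    return "quantitative change" if changed else "unchanged"


combos = ("2xL-4xL", "4xL-2xL", "4xL-4xL")
example = classify_pair(
    detected_ref=True,
    detected_longer=dict.fromkeys(combos, True),
    diffs={"2xL-4xL": 0.3, "4xL-2xL": 0.4, "4xL-4xL": 1.6},
    fdr_pvalues={"2xL-4xL": 0.40, "4xL-2xL": 0.20, "4xL-4xL": 0.01},
)
print(example)
```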
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II, and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II, and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using usearch software. PDB files 4C3I, 4V1N, and 5FJA were selected as representative monomeric complexes for the RNApol I, II, and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base, and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid, and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid, and the outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
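The distance calculation described above can be illustrated with a minimal sketch. The analyses themselves relied on NetworkX; the version below is a stand-in that uses only the Python standard library so that it runs without extra dependencies. The node names, coordinates and cutoff are toy values chosen for illustration and are not taken from the structures, and the use of "less than or equal to" for the cutoff is an assumption.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = distance."""
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:  # assumption: inclusive cutoff
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Toy surface nodes (hypothetical coordinates, arbitrary units).
toy_coords = {"A": (0, 0, 0), "B": (3, 0, 0), "C": (6, 0, 0), "D": (6, 4, 0)}
toy_graph = build_graph(toy_coords, cutoff=5.0)
toy_distance = shortest_path_length(toy_graph, "A", "D")
```

Because edges only join nearby surface nodes, the path length follows the surface of the complex rather than cutting straight through it, which is why it can exceed the direct Euclidean distance between two C-termini.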
All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins that are known to interact with the baits, that are within the same complexes as the baits, or that are random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI) after filtering.
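The distinction between protein pairs and unique pairs can be made concrete with a short sketch: a reciprocal pair (X as bait with Y as prey, and Y as bait with X as prey) collapses to a single unordered pair. The function name and the placeholder proteins X, Y and Z are hypothetical and do not correspond to entries in Table S1C.

```python
def unique_pairs(interactions):
    """Collapse reciprocal (bait, prey) tuples into unordered protein pairs."""
    return {frozenset((bait, prey)) for bait, prey in interactions}


# Placeholder example: X-Y is detected in both orientations, X-Z in one.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_pairs = len(detected)                  # orientation-aware count
n_unique = len(unique_pairs(detected))   # reciprocal pairs counted once
```

Using frozenset makes the pair order-independent, so three orientation-aware entries reduce to two unique pairs in this toy example.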
As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from the lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
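For readers who want to reproduce the kind of distance versus z-score correlation reported in Fig. 2B, the Spearman coefficient can be sketched as the Pearson correlation of ranks. The implementation below uses only the standard library; the function names are hypothetical and the example values are toy numbers, not data from Table S2A. In practice a statistics package would also return the associated p-value, which this sketch does not compute.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy example: z-scores that strictly decrease with distance.
toy_distances = [10.0, 20.0, 30.0, 40.0]
toy_zscores = [8.0, 6.0, 4.0, 2.0]
toy_rho = spearman(toy_distances, toy_zscores)
```

A rank-based coefficient is a natural choice here because it only assumes that z-scores tend to decrease with distance, not that the relationship is linear.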
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are in close proximity in space without necessarily being in physical interaction. Such a method would make it possible to investigate and dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted of modifying the length of the linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validation of the method could be completed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, which allowed
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all unique interactions
detected were new. One could therefore expect to obtain nearly as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
that are used. For example, it could be lower if only a single combination were used, because
some protein-protein associations were detectable only with one specific combination of
linkers. Using an extended linker on the DHFR F[12] fragment appears to be sufficient to
detect most of the new PPIs and those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within a complex. In addition, the use of extended linkers provides information
on the organization of protein complexes, particularly when it involves the central proteins.
Finally, the detected associations closely reflect the organization of protein complexes
into subcomplexes. Comparing the distances between proteins in the proteasome structures
with the PCA results confirms that increasing linker length does allow the detection of
associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply doubling the length of the linker on the DHFR F[12]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside the longer
linkers. This would provide a validation and a point of comparison, and would help detect
problems that arose during construction of the proteins, such as incorrect recombination or
the appearance of mutations. Interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. Adding this control does, however,
make the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no
special infrastructure, yet it can also be applied at large scale with a robotic platform.
Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are brought very
close together, which reduces false positives. The DHFR PCA can be performed on either solid
or liquid medium, so PPIs are easy to study under several growth conditions or in the
presence of cellular perturbations. It can also be monitored in real time, which gives access
to the dynamics of interactions (56). These features offer certain advantages over other
hybrid methods.
Only two linker lengths were tested in this project. It would be worthwhile to establish a
range of linker lengths that would give access to several resolutions of the PPI network.
The first step would be to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. The optimal increment would
also need to be determined, so as to maximize new information while accounting for the added
complexity that each additional linker introduces. The availability of robotic platforms
makes it more realistic to create collections of DHFR F[12] proteins with different linker
lengths. Such additional collections would give a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the detected associations, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution
would be needed for large protein complexes.
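To give a sense of the scale involved in choosing a range of linker lengths, the sketch
below computes an upper bound on the span of a linker made of repeats of the
Gly-Gly-Gly-Gly-Ser motif. It assumes about 3.8 angstroms per residue, the approximate
spacing between consecutive alpha carbons, which is a textbook approximation and not a value
taken from this work. Flexible linkers in a cell are expected to be considerably more compact
on average, so these numbers are ceilings, not detection ranges. The function names are
illustrative.

```python
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser repeat unit
ANGSTROM_PER_RESIDUE = 3.8  # assumed spacing for a fully stretched chain (upper bound)


def linker_sequence(repeats):
    """Amino-acid sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_linker_span(repeats, per_residue=ANGSTROM_PER_RESIDUE):
    """Upper bound, in angstroms, on the end-to-end length of one linker."""
    return len(linker_sequence(repeats)) * per_residue


for n in (2, 3, 4):
    residues = len(linker_sequence(n))
    print(f"{n}xL: {residues} residues, at most {max_linker_span(n):.0f} angstroms")
```

Under this assumption, going from two to four repeats doubles the ceiling on the span of a
single linker, from roughly 38 to roughly 76 angstroms.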
The method developed during this master's project is particularly relevant for the study of
macromolecular protein complexes. These are complexes whose composition is not fully known
but that are visible by electron microscopy or other imaging methods. Their size greatly
limits their study and makes determining their architecture a challenge. Processing bodies
and stress granules are examples. They are involved, respectively, in the degradation and the
storage of messenger RNA during cellular stress, and they are linked to various diseases such
as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded
by lengthening the linker would give a general picture of their architecture. At the scale of
an organism's proteome, this method would provide a better definition of the organization of
the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics : TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics : MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics : MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments : JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews : MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
IV
Abstract
Protein-protein interactions (PPI) are central to all cellular processes in all organisms
Grouped in two categories methods to study PPI allow either to identify proteins composing
protein complexes or to determine relationships between proteins Only a few hybrid methods
can be used to obtain both of those informations and these methods present many limitations
The goal of this project was to develop a new hybrid method by modifying the Protein-
fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae
DHFR PCA is based on the association of two complementary reporter fragments in presence
of an interaction Both fragments are fused to proteins with a peptide linker Linker length
limits the maximal distance at which it is possible to detect an interaction between two
proteins Our hypothesis was that increased linker length would allow the detection of more
distant interactions We first verified if the augmentation of linker length modified our
capacity to detect interactions without losing specificity New interactions were detected
inside and between complexes Then we validated our capacity to better dissect protein
complexes architecture by studying five protein complexes with different linker length
combinations Finally we confirmed that the method allowed the detection of interactions
that were further in space by comparing our results with distances calculated with available
proteasome structures This variation of DHFR PCA allows to modulate the resolution of PPI
study and thus better define protein complexes architecture
V
Table des matiegraveres
Reacutesumeacute III
Abstract IV
Table des matiegraveres V
Liste des tableaux VII
Listes des figures VIII
Listes des abreacuteviations IX
Remerciements XI
Avant-propos XIII
Introduction geacuteneacuterale 1
11 Lrsquoaspect fondamental des interactions proteacuteine-proteacuteine 1
12 Applications concregravetes de lrsquoeacutetude des interactions proteacuteine-proteacuteine 2
13 Cateacutegories de meacutethodes permettant drsquoeacutetudier les interactions proteacuteine-proteacuteine 3
131 Meacutethodes identifiant les membres drsquoun complexe proteacuteique Purification de complexes
proteacuteiques suivie de la spectromeacutetrie de masse 4
132 Meacutethodes deacuteterminant le reacuteseau drsquointeractions proteacuteiques 5
14 Deacutefi actuel dans lrsquoeacutetude des interactions proteacuteine-proteacuteine 8
15 Le connecteur un paramegravetre potentiellement inteacuteressant pour moduler la deacutetection des
interactions proteacuteine-proteacuteine 9
16 Objectifs de recherche 9
Measuring proximate protein association in living cells using Protein-fragment complementation
assay (PCA) 11
Reacutesumeacute 11
Abstract 12
Introduction 13
Material and Methods 14
Yeast 14
Bacteria 15
Plasmid construction 15
Strain construction 16
Estimation of protein abundance 16
Protein-fragment complementation assays 17
VI
PCA images and statistical analyses 19
Analysis of protein distances within complexes 21
Results and discussion 22
Longer linkers increase signal-to-noise ratio in large-scale screens 22
PCA signal reflects the super-organization of protein complexes 23
Longer linkers allow detection of more distant proteins in complexes 25
Conclusion 26
Acknowledgements 26
Conclusion geacuteneacuterale 43
Bibliographie 46
VII
Liste des tableaux
Table S1A Description of the strains constructed and used for this study 30
Table S1B PCA data for global PCA experiment 30
Table S1C PCA data for intra-complexes experiment 30
Table S1D PCR primers used in this study 30
Table S2A Distances between C-termini calculated from molecular modeling 31
Table S2B Identity between each RNApol structures and the experimental sequences 32
Table S2C Identity between proteasome structure and the experimental sequence 34
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I II
and III and proteasome structures 37
VIII
Listes des figures
Figure 1 Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment
complementation (PCA) screen and proves to be useful to infer the super-organization of
protein complexes 27
Figure 2 Longer linkers allow for the detection of more distant proteins within complexes
29
Figure S1 Data related to the PCA experiments 40
Figure S2 Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins 42
List of abbreviations
% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P-value (false discovery rate)
FRET Fluorescence resonance energy transfer
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Litre
Log Logarithm
M Molar
Min Minute
mL Millilitre
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
µb Estimated mean
µg Microgram
µL Microlitre
µM Micromolar
2YT 2× yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of many people, whom I sincerely wish to thank.
First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey,
Christian encouraged me to give my best, both scientifically and as a member of the group. He
not only gave me the material means to do so, but also showed me that I had the ability to do
it. Christian is a very present supervisor who is available to his students. He offered me
opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr.
Nicolas Bisson, for their advice and the time they devoted to this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research
professionals of the laboratory. Their great expertise and their passion for science are a
pillar of this team. Without their valuable advice, dedication and availability, carrying out
this project would have been particularly arduous. I also wish to thank my collaborators,
Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better.
Special thanks to Xavier for his help, his availability and the lively discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies
require spending a great deal of time in the laboratory, which becomes like a second home.
Hence the importance of sharing laughs and building a rapport with its members. I would like
to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions
and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the
summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her
collaboration and her smile, and to Marie for her advice on analysis. A very special thank you
to Guillaume and Hélène, who were particularly good at putting a smile on my face or at
supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends. My
parents always encouraged me to fulfil myself and to love my work. They not only provided an
ideal environment for reaching my goals throughout my studies, but also offered me their moral
support and instilled in me the importance of always doing one's best. The values they passed
on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family
and friends, I was able to unwind, simply have fun and pour my heart out from time to time.
They were a source of moral support.
Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an
incredibly generous person: generous with his time, his attention, his knowledge and his
passions. He was an invaluable support throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows.
He was also there to celebrate the successes. I have no words to describe how much this person
has given me personally, humanly and professionally. Marc has made me a better person, and I
will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will
be submitted for publication. The article presents an adaptation of the PCA method that makes
it possible to detect associations between proteins that are distant in space, and its
application to the study of protein complexes. I contributed to planning the experiments with
Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé
(research professionals). Several people, myself included, took part in carrying out these
experiments, namely Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student),
Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were
performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of
the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault,
Christian Landry and myself.
During this project, I also contributed to writing a literature review published in Briefings
in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein
interactomes reveals their mechanisms of regulation, robustness and insights into
genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral
fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume
Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry.
That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, through the great diversity of their roles, are considered the machinery of life.
Their associations, whether temporary or permanent, are at the heart of signalling and
regulatory pathways as well as protein complexes. Proteins can interact with one another
through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals
forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the
proper functioning of the cell, since they are involved in all cellular processes as well as
in the maintenance of cellular functions.
Interactions that form transiently are often found in signalling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination can
lead to diseases such as cancer (1). One example of a transient association is that between
the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The
activity of this enzyme is regulated by the association and dissociation of the catalytic and
regulatory subunits. In yeast and mammals, the transition from one form to the other controls
several processes, including energy metabolism, cell growth, ageing and the response to
stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's
syndrome (8).
In addition to transient interactions, the cell is home to stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs within a complex
are stable, the complex itself may form only in a particular context. A protein complex can be
defined as an association between two or more proteins (9). The association between these
proteins allows the emergence of additional biological activities that would be impossible for
the proteins taken individually. An example that illustrates this concept very well is the
proteasome, a protein complex involved in protein homeostasis through the degradation of
obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes,
consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory
subcomplexes. It comprises 33 proteins, some of which are present in more than one copy
(10-13). Given its importance in protein recycling, the proteasome is an attractive target for
fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network has been
mapped on a large scale for several organisms, including humans (17), Saccharomyces cerevisiae
(18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26)
and several viruses (27-29). These maps provide a static picture of the network and do not
fully take into account the cell's ability to adapt to different conditions (e.g.,
environment, cell cycle). To overcome this limitation, additional maps were subsequently
produced that consider the dynamics of interaction networks, that is, by perturbing cell
growth conditions. Among other things, they provide information on the adaptation, or
plasticity, of an organism in the presence of a stress or a new environment. Despite this new
perspective, it remains difficult to distinguish a stable interaction from a transient one
using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary
history of protein complexes can be traced by comparing PPIs, as shown by the study of the
nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than
1.5 billion years ago, show similarities and differences in the structure of their nuclear
pore. This essential protein complex forms a channel in the membrane of the cell nucleus and
controls the transport of molecules between the nucleus and the cytoplasm. Obado and
colleagues thus identified the ancestral part of the nuclear pore and the part that
subsequently diverged. The structural differences explain the distinct mRNA export mechanisms
in the two organisms (30). Furthermore, perturbing PPIs makes it possible to elucidate the
robustness of a protein complex to mutations, that is, the ability of the complex to function
despite the perturbation. Diss and colleagues systematically deleted the genes encoding the
proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential
protein complex whose function is to recycle membrane receptors. By analysing the interactions
present in these complexes after each perturbation, the authors observed that the nuclear pore
remained functional despite the loss of certain proteins, whereas the retromer dissociated
completely after the loss of a single protein. They were thus able to identify the proteins
essential for the assembly of these complexes and to demonstrate the importance of paralogs
for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In
addition, identifying structural differences in a protein complex between two organisms can
provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually
make it possible to treat the infections caused by these parasites (35). PPIs also help in
understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team
examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of
cases, the perturbation of interaction networks, whether partial or complete, was responsible
for the diseases under study. Moreover, different mutations in the same gene lead to different
perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary, since each has advantages and limitations that
allow it to target only a particular subset of the interaction network (37). Nevertheless, all
of these methods can be divided into two main categories: methods that determine the
composition of protein complexes and methods that determine the physical interactions between
two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyse it by mass spectrometry (MS). The second category
comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast
two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the
second category work on a very similar principle, based on the reconstitution of a functional
reporter that emits a signal when the two proteins physically interact. The second category
also includes three hybrid methods: fluorescence resonance energy transfer (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context,
the term "hybrid method" refers to methods that detect associations between proteins that are
close together in space without these necessarily being physical interactions. These methods
therefore have characteristics of both categories. For the purposes of this project, they are
considered part of the second category because they provide information on the spatial
relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships among them.
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes
followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS is a
method aimed at isolating a protein complex and identifying its members. Several techniques
are used to purify protein complexes, including affinity chromatography. Affinity
chromatography separates a protein of interest and its interactors from a protein extract by
means of an epitope specific to that protein. This epitope is recognized by an antibody bound
to the purification column. Several rounds of purification can be performed to reduce the
non-specific interactions that cause background noise. The isolated proteins are then digested
into peptides. The mass spectrometer ionizes these peptides and separates them according to
their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with
those in a database makes it possible to identify the proteins found in the complex (38-40).
Tandem mass spectrometry (MS/MS) can also be performed. Starting from a first MS, a peptide is
selected and fragmented, and a new spectrum is acquired from the resulting fragments. This
additional spectrum provides more information about the peptide (41, 42). Other purification
techniques exist, such as size-exclusion chromatography, in which separation is based on the
size of the protein complexes. The main advantage of this purification is that it makes it
possible to isolate all of the protein complexes of an organism for study (43).
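As a rough illustration of the mass-to-charge ratio on which this separation relies, the sketch below computes m/z for a peptide carrying one or more protons. Positive-ion mode, the function name `mass_to_charge` and the example mass of 1000 Da are assumptions made for illustration only; none of them come from the thesis.

```python
# Minimal sketch: m/z of a protonated peptide ion (positive-ion mode is assumed).
PROTON_MASS_DA = 1.007276  # mass of one proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Each added proton contributes both one unit of charge and its own mass.
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    # Hypothetical peptide of 1000 Da observed at charge states 1+ to 3+.
    for z in (1, 2, 3):
        print(z, round(mass_to_charge(1000.0, z), 4))
```

The same peptide therefore appears at a different position in the spectrum for each charge state, which is one reason why spectra are compared with database profiles rather than read as a simple list of masses.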
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and the protein-fragment complementation
assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest via a linker. When the two proteins of interest
physically interact, the two reporter fragments assemble, reconstituting a functional reporter
that allows a signal to be detected.
In Y2H, the reporter is a transcription factor which, once reconstituted, allows the yeast S.
cerevisiae to grow on a specific selection medium. Initially, the transcription factor was
Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that
enabled the development of several others. However, the technique has some limitations. First,
in classical Y2H, the proteins studied must be soluble. Variations of the method have
nevertheless been developed to allow the study of membrane proteins (45-47); the resulting
method is the subject of the next paragraph. Second, since the reporter is a transcription
factor, the interactions tested must take place in the nucleus, which may alter the endogenous
localization of the proteins. The technique also has low sensitivity, produces background
noise and is not quantitative. It often requires overexpression of the proteins, which can
generate false positives. It is therefore impossible to establish links between the abundance
of a protein and the strength or abundance of an interaction between proteins (48-50). Despite
these constraints, it is still widely used because it makes it possible to study the PPIs of
another species, such as humans, in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. In the presence of a physical interaction between the proteins of
interest, the transcription factor attached to the reconstituted ubiquitin is released,
thereby activating transcription of a reporter gene. Split-ubiquitin-based methods have led to
major advances in the study of proteins that are membrane-bound, insoluble or located outside
the nucleus. However, MYTH shares certain drawbacks with Y2H, such as substantial background
noise and the impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The choice
of reporter and of the split site were decisive elements in the design of the method.
Moreover, since the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to interact spontaneously unless they are very
close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a
mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to
methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in
particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the signal
observed is cell density, that is, the number of cells that have managed to grow on the
selection medium. This technique has the advantage of being quantitative, and it also
preserves the natural promoter of the proteins studied (48, 55, 56). In addition, PCA results
suggest that the cellular localization of proteins is preserved: there is gene ontology
enrichment for many proteins known to share the same cellular localization (55). However, a
change in localization cannot be ruled out, given that the reporter fragments are added at the
C-terminus, which could interfere with the proteins' localization signal sequence (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. However, adding a
linker often reduces these risks by moving the reporter fragment away from the protein to
which it is attached, thereby reducing interference between the two. It may be necessary to
optimize the linker's composition or length. There are three categories of linkers: flexible
linkers, rigid linkers and linkers cleavable in vivo. Flexible linkers are generally used when
some mobility between the protein of interest and the reporter fragment is desirable. Rigid
linkers provide better separation between the protein of interest and the reporter fragment
and ensure that the functions of each element are maintained. They are especially useful when
a flexible linker is insufficient to separate the two elements properly or when it interferes
with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be
released under certain conditions. They are particularly useful for allowing each element to
carry out its own biological activity. It is therefore essential to choose the linker and its
parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS, and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two
fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation
of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when
the two proteins are close to each other. The interaction is detected by microscopy or
cytometry through the emission of the acceptor fluorescent protein. This method is
particularly useful for following an interaction over time. However, substantial background
noise and the partial overlap of the fluorescence of the two proteins can hamper
interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS techniques,
except that, before purification, the proteins are attached to one another by covalent bonds.
These bonds withstand enzymatic digestion, thus providing structural information on how the
proteins are associated within the protein complex. Nevertheless, cross-linking complicates
data analysis and can potentially lead to a mistaken picture of the architecture of the
protein complex. This method is difficult to apply to the global study of protein complexes
(64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is performed by a mutant biotin ligase lacking specificity that is fused to the
protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions in
addition to interactions between neighbouring proteins. However, the biotin ligase is larger
than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular
biology. This large size can impair the activity of the protein of interest or the formation
of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They provide information on the proximity of proteins, giving access
to a new molecular scale of resolution that is otherwise difficult to reach. In addition to
being complex, the existing techniques require specialized infrastructure (equipment and
databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput
hybrid methods would make it possible to better define the architecture of protein complexes
and their subcomplexes at low molecular resolution. Such methods would complement the two
categories of methods. These new hybrid methods would compensate for the shortcomings of
high-resolution methods such as crystallography or nuclear magnetic resonance, which determine
the precise structure of proteins or protein complexes. Indeed, those high-resolution methods
are difficult to apply to many protein complexes and require an approach specific to each
complex.
1.5 The linker, a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity, and because a linker connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment made of two repeats of the following
motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the
reporter fragments to associate properly in the cellular environment. Glycine and serine are
both small amino acids, the former nonpolar and the latter polar and uncharged. The linker
connects the reporter fragment to the C-terminus of the proteins under study.
The length of the linker also constrains the ability to detect an interaction, as was
observed by the research team that developed large-scale PCA (55). While studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noticed that an
interaction was 3.5 times more likely to be detected when the C-termini of the proteins of
interest were less than 82 Å apart (55). This distance corresponds to the length of the two
linkers placed end to end. Furthermore, an earlier study had shown that increasing the
length of the linker made it possible to determine the conformation of a dimeric receptor
(69). A longer linker can thus reveal new interactions and, in doing so, provide new
structural information.
1.6 Research objectives
The results above suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length in the DHFR PCA would allow the
detection of interactions that are increasingly distant in space, which would modulate the
scale of molecular resolution. This adaptation would yield a new hybrid method that could
help define protein-protein associations between protein complexes and subcomplexes. The
first objective was to assess the general impact of different linker lengths on the ability
to detect protein-protein associations. To this end, the associations of 15 proteins found
in seven protein complexes were tested against the proteins found in these complexes and
their known interactors. The second objective was to assess how increasing the linker length
affects our understanding of the architecture of protein complexes and their subcomplexes.
Five protein complexes that differ in size and flexibility were studied: the proteasome,
RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was
carried out with different combinations of linker lengths. The last objective was to test
whether increasing the linker length allows the detection of associations between proteins
that are farther apart in space. To do so, distances between the proteins contained in the
proteasome structures were calculated and compared with the experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of many powerful
genetic tools, its rapid cell division, and the abundance of data on the structure of
protein complexes and on PPIs. Moreover, this organism has played a key role in advancing
knowledge in various fields, such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks and the study of human diseases
(70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using longer peptide linkers between the fusion proteins and the DHFR fragments
improves the detection of protein-protein interactions and reveals interactions that are
more distant in space. Longer linkers thus provide an improved tool for detecting and
measuring protein-protein interactions and protein proximity in vivo. We used this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for
dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are farther apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the
architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG)
complexes in yeast. Our results open new avenues for the dissection of protein networks in
living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of cellular
functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36,
74, 75), and have shown how the regulation of protein expression at the transcriptional,
translational and posttranslational levels contributes to the diversity of protein complex
assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity
purification (82, 83). The majority of PPIs that populate interactome databases derive from
such methods because a single purification leads to the inference of many interactions among
the co-purified proteins. Unfortunately, very little is known about the structural and
context dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies
based on similar principles (52). The two categories are potentially complementary because,
on the one hand, they tell us which proteins assemble into complexes in the cell and, on the
other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future developments in cell and systems biology, because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing methods
that detect co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94).
Proteins are usually connected to the reporter fragments with a linker of ten amino acids.
In principle, the length of the linker limits the maximum distance between the proteins for
an interaction to be detectable. The first large-scale study performed using DHFR PCA in
yeast showed that the distance constraint imposed by linker length could affect the ability
to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between the C-termini of proteins could be measured,
protein interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable
sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit
roughly, distances between proteins in living cells. Here, we test the effect of linker size
on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
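The "ruler" idea can be illustrated with a rough calculation. The sketch below assumes
that the reach of a linker scales linearly with its number of GGGGS repeats, and it
calibrates the per-residue length on the 82 Å threshold reported for two standard
ten-residue linkers placed end to end. The resulting 4.1 Å per residue, the function names
and the treatment of linkers as fully extended are illustrative assumptions, not values
measured in this study.

```python
# Rough upper bound on the C-terminus separation that two linkers could bridge.
# Assumption: reach scales linearly with the number of GGGGS repeats, calibrated
# on the 82 Å reported for two 2xL linkers placed end to end.
RESIDUES_PER_REPEAT = 5
ANGSTROM_PER_RESIDUE = 82.0 / (2 * 2 * RESIDUES_PER_REPEAT)  # derived, about 4.1 Å


def linker_length(repeats: int) -> float:
    """Approximate extended length (Å) of a linker made of `repeats` GGGGS motifs."""
    return repeats * RESIDUES_PER_REPEAT * ANGSTROM_PER_RESIDUE


def max_reach(bait_repeats: int, prey_repeats: int) -> float:
    """Hypothetical maximum distance (Å) spanned by a bait linker plus a prey linker."""
    return linker_length(bait_repeats) + linker_length(prey_repeats)


if __name__ == "__main__":
    for bait, prey in [(2, 2), (2, 4), (4, 2), (4, 4)]:
        print(f"{bait}xL-{prey}xL: {max_reach(bait, prey):.0f} Å")
```

In the cell a flexible linker is rarely fully extended, so these numbers are best read as
ceilings rather than expected detection distances.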
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on
YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium))
containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for
transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX
medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2%
glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL
methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to a linker of varying size. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
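The use of synonymous codons can be sketched as follows. This is a minimal illustration,
assuming a simple rotation through the standard glycine and serine codons; the actual codon
choices made for the synthesized fragments are not given in the text, and the helper names
are hypothetical.

```python
# Build GGGGS repeats whose DNA differs between repeats but whose peptide is identical.
# The rotation scheme is an assumption for illustration, not the authors' design.
GLY_CODONS = ["GGT", "GGC", "GGA", "GGG"]
SER_CODONS = ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]
CODON_TO_AA = {**{c: "G" for c in GLY_CODONS}, **{c: "S" for c in SER_CODONS}}


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence, three bases at a time."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recoded_repeat(offset: int) -> str:
    """One GGGGS repeat whose codon choice is rotated by `offset`."""
    gly = [GLY_CODONS[(offset + i) % len(GLY_CODONS)] for i in range(4)]
    ser = SER_CODONS[offset % len(SER_CODONS)]
    return "".join(gly) + ser


def linker_dna(repeats: int) -> str:
    """Coding sequence for a linker made of `repeats` recoded GGGGS motifs."""
    return "".join(recoded_repeat(k) for k in range(repeats))


if __name__ == "__main__":
    dna = linker_dna(4)
    print(dna)
    print(translate(dna))
```

Varying the codons keeps the peptide unchanged while making consecutive repeats differ at
the DNA level.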
To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR
F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in plasmid pUC57
with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then
amplified by PCR, digested with DpnI and purified. Plasmid pAG25-linker-DHFR
F[1,2]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the
plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids
were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions
were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones
were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from
pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR
F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[3]
region was gel-extracted. The remaining steps were performed as described above for
pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio
of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions,
respectively (Table S1A). All fusions were made at the 3′ end of genes. The
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR
F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene
to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following standard
procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or
YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR
fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to
the 2xL and 4xL. These proteins were selected because their abundance could easily be
assessed using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma)
were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and
transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham).
After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed
with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly
provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a
loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room
temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with
the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey
anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer +
0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and
the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®).
Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins that are part of seven
complexes, each fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3]
(495 strains) were selected because they belong to the same complexes as the baits or
because they interact with one of them according to data reported in BioGRID in October
2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or
the nucleus was also included in the set of preys as controls. Each prey was present in
four replicates, two on each prey plate, so each interaction was measured four times. Preys
were randomly positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the consensus
protein complexes published in (84) to choose 95 central and associated proteins that are
members of the following complexes: RNApol I, II and III, the proteasome and the COG
complex. These complexes were selected because they vary in size (RNApol I (n=14), II
(n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested);
and COG complex (n=8)) and because interactions among members of these complexes have been
shown to be detectable, at least partially, by DHFR PCA. In addition, there are published
structures available for the RNApol and proteasome complexes, making it possible to compare
our results with known protein complex organization. We successfully constructed 80.0% and
76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome,
respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused
to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of
the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating
type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa
cells were generated, including all strains mentioned above. Baits and preys were
positioned so that, within a block of four strains, all combinations of linker sizes could
be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of
bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in
16 replicates for the proteasome complex. The blocks were randomly positioned on the colony
arrays. Finally, each 1536-array was designed to contain a double border of a strain
showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on
the growth of the colonies.
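The block design can be pictured as the Cartesian product of bait and prey linker lengths.
The sketch below only enumerates the four combinations for one protein pair; the strain
naming scheme and the function name are hypothetical, and the physical placement of the
four colonies on the array is not modeled.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str) -> list[tuple[str, str]]:
    """All bait-prey linker combinations for one protein pair (one block of four)."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]


if __name__ == "__main__":
    # Protein names are used only as an example pair.
    for bait_strain, prey_strain in linker_block("Rpn5", "Scl1"):
        print(bait_strain, "x", prey_strain)
```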
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated
at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin)
replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics)
and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by
arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies
were further condensed into 384-format arrays and finally into 1536-format arrays using a
96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the
two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C.
Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time
of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was
performed. Plates were incubated at 30 °C for another four days. Images were taken with an
EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end
of the experiment.
For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments is shown in Fig S1E. For the intra-complexes experiment, we confirmed results
for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig
S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to
proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i
camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using
the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the
pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and applied a manual threshold across
plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area smaller than 20 pixels and a
circularity below 0.5, and used the particle closest to the center. We considered the
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
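For readers unfamiliar with the two-way median polish mentioned above, the following is a
minimal pure-Python sketch of Tukey's procedure on a small matrix. It is not the ImageJ
script used in the study: the fixed number of iterations and the function name are
assumptions, and how the fitted values were combined with the raw image is not reproduced
here.

```python
from statistics import median


def median_polish(matrix, n_iter=10):
    """Tukey's two-way median polish.

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep row medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        # Move the median of the column effects into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep column medians out of the residuals.
        for j in range(n_cols):
            m = median([resid[i][j] for i in range(n_rows)])
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Move the median of the row effects into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid
```

On a plate image, the row and column effects capture smooth gradients across the plate,
while the residuals retain local features such as colonies.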
Colony intensity values from day 4 of growth on the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a
size smaller than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation
(sdb) of the background distribution were used to convert each interaction score into a
z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected. These Zs were used to compare the same interaction across linker
size combinations. We considered a change significant when Zs differed by more than 2.
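The scoring steps can be summarized in a short sketch. It assumes that the background mean
(b) and standard deviation (sdb) have already been estimated by the mixture model; the
mixture fit itself (done in R with mixtools in this study) is not reproduced. Function
names are illustrative, and the thresholds are the ones stated in the text.

```python
from math import log2
from statistics import median

Z_THRESHOLD = 2.5      # Zs above this value counts as a detected interaction
DELTA_THRESHOLD = 2.0  # Zs difference above this value counts as a change


def interaction_score(colony_sizes):
    """Interaction score (Is): median of log2(size + 1) across replicate colonies."""
    return median([log2(s + 1) for s in colony_sizes])


def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background distribution."""
    return (score - background_mean) / background_sd


def is_detected(zs):
    return zs > Z_THRESHOLD


def compare_linkers(zs_reference, zs_longer):
    """Label the change in Zs between the reference and a longer linker."""
    diff = zs_longer - zs_reference
    if diff > DELTA_THRESHOLD:
        return "increased"
    if diff < -DELTA_THRESHOLD:
        return "decreased"
    return "unchanged"
```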
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e.
values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent
the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination
were retained and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to
calculate these parameters. An interaction was considered detected when the Zs was larger
than 2.5. Of the 236 protein pairs with an interaction detected with at least one linker
combination, some were filtered out, mainly because they did not pass all of the thresholds
or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave incoherent results
for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting
proteins.
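The outlier rule above can be written in a few lines. This sketch assumes the "inclusive"
quartile convention of Python's statistics module; the text does not say which quartile
definition was used, so borderline values could be classified differently.

```python
from statistics import quantiles


def drop_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


if __name__ == "__main__":
    replicate_sizes = [10, 11, 12, 13, 14, 100]  # made-up colony sizes
    print(drop_extreme_outliers(replicate_sizes))
```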
At this step, pairs of interacting proteins presenting a new interaction (i.e. the
interaction was not detected with the reference linker size (2xL-2xL) but was detected with
a longer linker combination) were separated from the others and classified as new
interactions (Table S1C). For the remaining pairs, because baits and preys were positioned
so that all combinations of linker lengths could be tested for a specific interaction
within a block of four adjacent strains (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for
the different linker size combinations could be compared directly. The difference with the
reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL
and 4xL-4xL). A paired t-test was used to identify significant differences in colony size
(with FDR-corrected p-values). These pairs of interacting proteins were separated into two
additional categories: unchanged interactions, where the interaction was detected with the
reference linker size (2xL-2xL) and also with the longer linker combinations but without
any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the
interaction was detected with the reference linker size (2xL-2xL) and changed significantly
for at least one longer linker combination (difference greater than 1 or smaller than -1,
with t-test FDR p-value < 0.05) (Table S1C).
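The classification rules can be sketched as below. Two points are assumptions: the FDR
correction is implemented as the Benjamini-Hochberg procedure, which the text does not
name, and the p-values are taken as given rather than computed from a paired t-test. The
function names are illustrative, and the classifier handles one longer linker combination
at a time.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_reference, detected_longer, difference, fdr_p):
    """Assign one longer-linker comparison to a category described in the text."""
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    if abs(difference) > 1 and fdr_p < 0.05:
        return "quantitative change"
    return "unchanged"
```

A pair would then be reported as a quantitative change if at least one of its longer linker
combinations falls into that category.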
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the
base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome
complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of
the experimental protein sequences of the individual sections of the proteasome complex
with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4.
Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is
composed of a full core. A complete proteasome structure was built by superposing two 5A5B
structures on the structure of 5CZ4, one on each side of the CP, using the super command in
PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect
overlap in the central core (Fig S2B). This region is well resolved in 5CZ4. Thus, the
final proteasome structure was composed of 5A5B for the base, the lid and the outer rings
of the core, while the inner rings of the core were taken from 5CZ4. Fig S2A summarizes the
methodology used to build the final proteasome structure. Table S2C presents the identity
between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the C-terminal section of the protein is incomplete
in the structure. In these cases, the last available residue was used instead to calculate
the distance (a list is provided in Table S2D). The distances were calculated from the
weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example
of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of
surface residues were used as nodes to build the graph. Edges were placed between each pair
of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome.
The weight of each edge was equal to the distance between the node pair. Surface residues
were identified as follows. First, the structure of the protein complex was represented
using the “show dots” and “set dot_solvent” commands in PyMOL, using a solvent radius of
10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in
the “wrl” graphic file format. From this file, the coordinates of each dot were extracted.
Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of
the proteasome structure were considered as surface residues (see Fig S2D for a
representation of the method for the proteasome). In cases where multiple copies of a
protein were present within the complexes, the mean of the minimal possible distances was
used for the analyses.
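The surface-path distance described above can be sketched without external dependencies.
The study used the Dijkstra implementation in NetworkX; the version below swaps in a
pure-Python Dijkstra built on heapq so that it runs on its own. The coordinates are
made-up points standing in for surface Cα atoms, and the function names are illustrative.

```python
import heapq
from math import dist, inf


def build_graph(coords, cutoff):
    """Connect every pair of points within `cutoff`; edge weight = Euclidean distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Weighted shortest-path length (Dijkstra); returns inf if no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbour, inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return inf


if __name__ == "__main__":
    # Hypothetical surface points (Å); 0 and 2 are too far apart for a direct edge.
    points = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (10, 10, 0)]
    surface = build_graph(points, cutoff=15.0)
    print(shortest_path_length(surface, 0, 2))
```

Because edges exist only between nearby surface points, the path follows the surface of the
complex instead of cutting through its interior, so the result can exceed the straight-line
distance between the two C-termini.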
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55),
which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three
and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR
templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be
introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length
affects protein stability, the effect is minor and not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
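The screen size and the hit-calling step can be sketched in a few lines of Python. This is only an illustration: the z-score is assumed here to be the deviation of a colony size from the mean of a background distribution, in units of its standard deviation, and the colony sizes are invented toy values. The cut-off of 2.5 is our reading of the threshold quoted above and is exposed as a parameter.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cut-off as read from the text; adjust if your analysis differs

def z_scores(colony_sizes, background_sizes):
    """Convert colony sizes to z-scores relative to a background distribution."""
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}

def call_ppis(scores, threshold=Z_THRESHOLD):
    """Return the bait-prey pairs whose z-score exceeds the threshold."""
    return sorted(pair for pair, score in scores.items() if score > threshold)

# Screen size: 15 bait proteins x 3 linker lengths, each tested against 592 preys.
n_baits = 15 * 3
n_preys = 592
n_tests = n_baits * n_preys

# Toy colony sizes (hypothetical numbers, for illustration only).
background = [10.0, 12.0, 11.0, 9.0, 13.0]
sizes = {("Arc18", "Pup1"): 20.0, ("BaitX", "PreyY"): 12.0}
scores = z_scores(sizes, background)
hits = call_ppis(scores)
```

Keeping the threshold as an argument makes it easy to check how sensitive the number of called PPIs is to the chosen cut-off.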
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complexes were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3] account for only one PPI) after filtering.
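Counting reciprocal interactions only once amounts to treating each bait-prey pair as unordered. A minimal Python sketch of that bookkeeping, with a hypothetical function name and an invented input list, could look like this:

```python
def unique_pairs(detected):
    """Collapse (bait, prey) tuples so that X-Y and Y-X count as a single PPI."""
    return {frozenset(pair) for pair in detected}

# Hypothetical detections: the first two entries are reciprocal.
detected = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpn5", "Rpn6")]
unique = unique_pairs(detected)
```

Using `frozenset` keeps the orientation-independent identity of each pair while still allowing the pairs to be stored in a set.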
As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]/Y-DHFR F[3] versus Y-DHFR F[1,2]/X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
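The categories used above (unchanged, quantitatively changed, new) can be expressed as a small decision rule. The sketch below is an assumption-laden illustration: the detection threshold and the two-fold criterion are borrowed from the global screen and may not be the exact criteria applied to the intra-complex data, and the "lost" label is our own addition for pairs detected only with the reference linkers. The tally at the end simply re-derives the proportions quoted in the text.

```python
def classify_pair(z_ref, z_long, threshold=2.5, fold=2.0):
    """Compare the 2xL-2xL z-score (z_ref) with a longer-linker z-score (z_long)."""
    detected_ref = z_ref > threshold
    detected_long = z_long > threshold
    if not detected_ref and not detected_long:
        return "not detected"
    if not detected_ref:
        return "new"          # seen only with the longer linker
    if not detected_long:
        return "lost"         # seen only with the reference linker
    ratio = z_long / z_ref    # z_ref > threshold > 0, so the division is safe
    if ratio >= fold or ratio <= 1 / fold:
        return "changed"
    return "unchanged"

# Counts of interacting pairs reported in the text.
counts = {"unchanged": 135, "changed": 19, "new": 74}
total = sum(counts.values())
fraction_unchanged = counts["unchanged"] / total
```

The three reported counts add up to the 228 interacting pairs, and the unchanged fraction comes out at roughly 59%, in line with the "almost 60%" stated above.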
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
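The negative relationship between pairwise distance and z-score is summarized with a Spearman rank correlation (see the Figure 2 legend). For readers unfamiliar with it, the following self-contained Python sketch computes the coefficient as the Pearson correlation of the ranks, with tied values sharing their average rank. The data are invented toy values, not measurements from this study, and the p-values reported in the legend are not reproduced here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all values tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy example (hypothetical distances in Å and z-scores).
distances = [10.0, 20.0, 30.0, 40.0, 50.0]
z = [9.0, 7.0, 8.0, 3.0, 1.0]
rho = spearman(distances, z)
```

A negative coefficient, as in this toy example, indicates that z-scores tend to decrease as the distance between C-termini increases.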
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference  Yeast proteins  Complex  Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structures and the experimental sequences
Reference  Yeast proteins  Complex  Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures
Yeast proteins  Complex  Reference  Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results for the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins in close spatial proximity, without these associations necessarily being physical interactions. Such a method would thus make it possible to examine and dissect the architecture of protein complexes in greater depth. Concretely, the approach was to modify the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, confirmation of the validity of the method could be completed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.
The results of the first validation show that by adjusting a single parameter, namely by doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is maintained despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA allows the detection of new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain almost as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths that are used. For example, this number could be reduced by using only one combination, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. However, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are further apart in space.
The modification made to DHFR PCA represents a valuable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have occurred during the construction of the proteins. For example, it is easier to spot a problem of incorrect recombination or of mutations having arisen. Indeed, interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides access to an additional hybrid method that remains relatively simple. It requires no particular infrastructure, yet can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These elements offer certain advantages compared with other hybrid methods.
In this project, only two linker lengths were tested. It would be worthwhile to establish a
range of linker lengths that gives access to several resolutions of the PPI network. The first
step would be to determine the maximal length that still detects plausible protein-protein
associations, in order to limit false positives. The second would be to determine the optimal
increment that maximizes the new information gained, taking into account the additional
complexity introduced by each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[12] fusion proteins with different linker lengths.
Such additional collections would provide an image of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more
distant the detected associations, which lowers the molecular resolution. Before investigating
a protein complex more exhaustively, its characteristics, such as its size and flexibility,
should be taken into account. For small protein complexes, a finer resolution, and therefore
shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large
protein complexes.
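The relationship between linker length and resolution can be made concrete with a rough upper bound. The sketch below assumes that each residue of a fully stretched linker contributes about 3.5 Å. This is a round figure chosen for illustration, not a value measured in this work, and a flexible Gly-Gly-Gly-Gly-Ser linker is on average more compact than this bound. The function names are hypothetical, and the size of the reporter fragments themselves is ignored.

```python
# Repeat motif of the flexible linker (Gly-Gly-Gly-Gly-Ser).
MOTIF = "GGGGS"
# Assumed length contributed by each residue of a fully stretched chain, in
# angstroms. This is an upper bound; a flexible linker is usually more compact.
ANGSTROM_PER_RESIDUE = 3.5


def linker_sequence(repeats):
    """Amino acid sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_linker_span(repeats):
    """Upper bound on the end-to-end length of one linker, in angstroms."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


def max_pair_reach(repeats_f12, repeats_f3):
    """Upper bound on the distance bridged by the two linkers of a pair."""
    return max_linker_span(repeats_f12) + max_linker_span(repeats_f3)


for repeats in (2, 3, 4):
    print(f"{repeats}xL: {linker_sequence(repeats)} "
          f"(at most {max_linker_span(repeats):.1f} angstroms)")
```

Under these assumptions each added repeat extends the bound by the same fixed amount, so the choice of increment directly sets how finely the range of detectable distances is sampled.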
The method developed during this master's project is particularly relevant to the study of
macromolecular protein complexes. These are complexes whose composition is not fully known but
which are visible by electron microscopy or other imaging methods. Their size greatly limits
their study and makes determining their architecture a challenge. Processing bodies and stress
granules are one example. They are involved, respectively, in the degradation and the storage
of messenger RNA during cellular stress, and they are linked to various diseases such as cancer
and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by
lengthening the linker would give us a general picture of their architecture. At the level of
an organism's proteome, this method would provide a better definition of the organization of
the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Abstract
Protein-protein interactions (PPIs) are central to all cellular processes in all organisms.
Methods for studying PPIs fall into two categories: they either identify the proteins that make
up a protein complex or determine the relationships between proteins. Only a few hybrid methods
provide both kinds of information, and these methods have many limitations. The goal of this
project was to develop a new hybrid method by modifying the protein-fragment complementation
assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on the association of
two complementary reporter fragments in the presence of an interaction. Each fragment is fused
to a protein through a peptide linker. Linker length limits the maximal distance at which an
interaction between two proteins can be detected. Our hypothesis was that increasing linker
length would allow the detection of more distant interactions. We first verified whether
increasing linker length changed our capacity to detect interactions without losing
specificity. New interactions were detected within and between complexes. We then validated our
capacity to better dissect the architecture of protein complexes by studying five protein
complexes with different combinations of linker lengths. Finally, we confirmed that the method
allowed the detection of interactions between proteins that are farther apart in space by
comparing our results with distances calculated from available proteasome structures. This
variation of DHFR PCA makes it possible to modulate the resolution of PPI studies and thus to
better define the architecture of protein complexes.
Table of contents
Résumé
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental role of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
1.3.2 Methods determining the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
Résumé
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography
List of tables
Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
List of figures
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and prove to be useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins
List of abbreviations
%  Percentage
°C  Degree Celsius
Å  Ångström
DNA  Deoxyribonucleic acid
Amp  Ampicillin
mRNA  Messenger ribonucleic acid
BioID  Proximity-dependent biotinylation
ClonNAT  Nourseothricin
COG  Conserved oligomeric Golgi
DHFR  Dihydrofolate reductase
DMSO  Dimethyl sulfoxide
F[12]  Fragment 12 of DHFR
F[3]  Fragment 3 of DHFR
FDR  Corrected P-value (false discovery rate)
FRET  Energy transfer between fluorescent molecules
g  Gram
Gly or G  Glycine
h  Hour
HygB  Hygromycin B
Is  Interaction score
L  Liter
Log  Logarithm
M  Molar
Min  Minute
mL  Milliliter
mM  Millimolar
MS  Mass spectrometry
MSMS  Tandem mass spectrometry
MTX  Methotrexate
MYTH  Membrane yeast two-hybrid
NaCl  Sodium chloride
NMR  Nuclear magnetic resonance
OD  Optical density
PBS  Phosphate-buffered saline
PCA  Protein-fragment complementation assay
PCR  Polymerase chain reaction
PKA  Protein kinase A
PPI  Protein-protein interaction
Q1  Quartile 1
Q3  Quartile 3
r  Correlation coefficient
RNApol  RNA polymerase
Sdb  Standard deviation
Ser or S  Serine
SDS  Sodium dodecyl sulfate
SDS-PAGE  Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test  Student's t-test
YPD  Yeast extract peptone dextrose
Y2H  Yeast two-hybrid
Zs  Z-score
μb  Estimated mean
μg  Microgram
μL  Microliter
μM  Micromolar
2YT  2x yeast extract tryptone
2xL  Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL  Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL  Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of many people, whom I sincerely wish to thank.
First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey,
Christian encouraged me to give the best of myself, both scientifically and as a member of the
group. He not only gave me the material means to do so, he also showed me that I had the
abilities to do so. Christian is a supervisor who is very present and available for his
students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and
Dr. Nicolas Bisson, for their advice and the time they devoted to this project.
I would likewise like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research
professionals of the laboratory. Their great expertise and their passion for science are a
pillar of this team. Without their valuable advice, their dedication and their availability,
carrying out this project would have been particularly arduous. I also wish to thank my
collaborators Xavier Barbeau and Patrick Lagüe. My thesis is better for their excellent work.
Special thanks to Xavier for his help, his availability and the lively discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies
mean spending a great deal of time in the laboratory, which becomes like a second home. Hence
the importance of sharing laughter and building a close bond with its members. I would like to
thank them all for the chats and laughs at the famous "tea breaks", the animated discussions
and, of course, the support both in the laboratory and on a personal level. Thank you to
Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to
Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very
special thank you to Guillaume and Hélène, who were especially good at making me smile and at
supporting and advising me in difficult times.
It is also important to thank my parents, as well as all my family and friends. My parents have
always encouraged me to fulfil myself and to love my work. They not only provided an ideal
setting for reaching my goals throughout my studies, they also gave me their moral support and
taught me the importance of always doing one's best. The values they passed on to me have given
me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I
was able to unwind, simply have fun and pour my heart out from time to time. They were a source
of moral support.
Finally, I wish to thank, from the bottom of my heart, my partner Marc Bélanger. Marc is an
incredibly generous person, generous with his time, his attention, his knowledge and his
passions. His support was invaluable throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows.
He was also there to celebrate the successes. I have no words to describe how much this person
has given me personally, humanly and professionally. Marc has made me a better person, and I
will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will
be submitted for publication. The article presents the adaptation of the PCA method to detect
associations between proteins that are distant in space, and its application to the study of
protein complexes. I contributed to the planning of the experiments with Christian R. Landry
(project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research
professionals). Several people, myself included, took part in carrying out these experiments:
Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and
Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier
Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the
writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and
myself.
During this project, I also contributed to the writing of a literature review published in
Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of
protein interactomes reveal their mechanisms of regulation, robustness and insights into
genotype-phenotype maps". Several people took part in the writing: Marie Filteau
(postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral
student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and
Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental role of protein-protein interactions
Proteins, through their great diversity of roles, are regarded as the machinery of life.
Their temporary or permanent associations are at the heart of signalling and regulatory
pathways as well as protein complexes. Proteins can interact with one another through
intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces
and ionic interactions. Protein-protein interactions (PPIs) are essential for the cell to
function properly, since they are involved in all cellular processes and in the maintenance
of cellular functions.
Interactions that form transiently are often found in signalling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination can
lead to diseases such as cancer (1). One example of a transient association is that of the
two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The
activity of this enzyme is regulated by the association and dissociation of the catalytic and
regulatory subunits. In yeast and mammals, the transition from one form to the other controls
several processes, including energy metabolism, cell growth, ageing and the response to
stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's
syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between proteins,
which lead to the formation of protein complexes. Although the PPIs within a complex are
stable, the complex itself may form only in a particular context. A protein complex can be
defined as an association between two or more proteins (9). The association between these
proteins gives rise to additional biological activities that would be impossible for the
individual proteins. The proteasome illustrates this concept well: it is a protein complex
involved in protein homeostasis through the degradation of obsolete proteins tagged with a
ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped
catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins,
some present in more than one copy (10-13). Given its importance in protein recycling, the
proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for
example (14-16).
The two preceding examples illustrate the essential role of protein-protein associations.
Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction
network. Mapping PPI networks is essential for understanding the organization, functioning
and cellular viability of a given organism. The PPI network has been mapped at large scale
for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila
melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses
(27-29). These maps are a static picture of the network and do not fully account for the
cell's ability to adapt to different conditions (e.g. environment, cell cycle). To overcome
this limitation, additional maps were later produced that consider the dynamics of
interaction networks, that is, by perturbing cell growth conditions. Among other things, they
provide information on the adaptation or plasticity of an organism faced with a stress or a
new environment. Despite this new perspective, it remains difficult to distinguish a stable
interaction from a transient one using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary
history of protein complexes can be traced by comparing PPIs, as shown by the study of the
nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than
1.5 billion years ago, show both similarities and differences in the structure of their
nuclear pore. This essential protein complex forms a channel in the nuclear membrane and
controls the transport of molecules between the nucleus and the cytoplasm. Obado and
colleagues thus identified the ancestral part of the nuclear pore and the part that
subsequently diverged. The structural differences explain the distinct mRNA export mechanisms
of the two organisms (30). In addition, perturbing PPIs helps elucidate the robustness of a
protein complex to mutations, that is, the ability of the complex to function despite the
perturbation. Diss and colleagues systematically deleted the genes coding for the proteins
found in the nuclear pore and the retromer (31). The retromer is a non-essential protein
complex whose function is to recycle membrane receptors. By analysing the interactions
present in these complexes after each perturbation, the authors observed that the nuclear
pore remained functional despite the loss of certain proteins, whereas the retromer
dissociated completely after the loss of a single protein. They were thus able to identify
the proteins essential for the assembly of these complexes and to demonstrate the importance
of paralogs for robustness (31).
In medicine, the study of PPIs has been widely used to discover new drugs (32-34). Moreover,
identifying structural differences in a protein complex between two organisms can provide
attractive targets for selectively inhibiting the complex of one organism. Very recently, a
research group developed an inhibitor that targets the proteasome of Leishmania donovani,
Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually make it
possible to treat the infections caused by these parasites (35). PPIs also help to understand
the genetic basis of diseases, as shown by Sahni and colleagues. This team studied nearly
3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases,
perturbation of interaction networks, whether partial or complete, was responsible for the
diseases under study. Furthermore, different mutations in the same gene lead to different
perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary, since each has advantages and limitations that
restrict it to a different subset of the interaction network (37). All of these methods can
nonetheless be divided into two main categories: those that determine the composition of
protein complexes and those that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyse it by mass spectrometry (MS). The second category
covers a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast
two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). Methods in the second
category share a very similar principle, based on the reconstitution of a functional reporter
that emits a signal when the two proteins interact physically. The second category also
includes three hybrid methods: energy transfer between fluorescent molecules (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context,
"hybrid method" refers to methods that detect associations between proteins that are close
together in space without necessarily interacting physically. These methods therefore have
characteristics of both categories. In this project, they are considered part of the second
category because they provide information on the spatial relationships between proteins.
The two categories of methods are complementary: one defines the components of a protein
complex, and the other defines the relationships among those components.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS is an
approach that aims to isolate a protein complex and identify its members. Several techniques
are used to purify protein complexes, including affinity chromatography. Affinity
chromatography separates a protein of interest and its interactors from a protein extract by
means of an epitope specific to that protein. This epitope is recognized by an antibody bound
to the purification column. Several rounds of purification can be performed to reduce the
non-specific interactions that cause background noise. The isolated proteins are then
digested into peptides. The mass spectrometer ionizes these peptides and separates them
according to their mass-to-charge ratio, yielding a mass spectrum. Comparing the resulting
profiles with those in a database identifies the proteins found in the complex (38-40).
Tandem mass spectrometry (MS/MS) can also be performed: from a first MS, a peptide is
selected and fragmented, and a new spectrum is acquired from the resulting fragments. This
additional spectrum provides more information on the peptide (41, 42). Other purification
techniques exist, such as size-exclusion chromatography, in which separation is based on the
size of the protein complexes. The main advantage of this purification is that it allows all
the protein complexes of an organism to be isolated for study (43).
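As a rough illustration of the mass-to-charge separation described above, the following
Python sketch computes the m/z value expected for a peptide ion that carries a given number
of protons. The function name and the example peptide mass are assumptions made for
illustration and do not come from the thesis; only the proton mass is a physical constant.

```python
# Minimal sketch: m/z of a protonated peptide ion.
# `peptide_mz` and the 1000.0 Da example mass are hypothetical, for illustration only.

PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def peptide_mz(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of a peptide of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Each added proton contributes both one unit of charge and its own mass.
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# The same peptide appears at different positions depending on its charge state.
singly_charged = peptide_mz(1000.0, 1)
doubly_charged = peptide_mz(1000.0, 2)
```

The same peptide therefore appears at a lower m/z when it carries more charges, which is one
reason the charge state has to be taken into account when spectra are compared with a
database.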
1.3.2 Methods determining the protein interaction network
1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid and the protein-fragment
complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
linked to the two proteins of interest through a linker. When the two proteins of interest
interact physically, the two reporter fragments assemble, reconstituting a functional
reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was
Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that
paved the way for several others. However, the technique has some limitations. First, in
classical Y2H, the proteins studied must be soluble. Variations of the method have
nevertheless been introduced to allow the study of membrane proteins (45-47); this approach
is the subject of the next paragraph. Second, because the reporter is a transcription factor,
the interactions tested must take place in the nucleus, which may alter the endogenous
localization of the proteins. The technique also has low sensitivity, produces background
noise and is not quantitative. It often requires overexpression of the proteins, which can
generate false positives. It is therefore impossible to relate the abundance of a protein to
the strength or abundance of an interaction between proteins (48-50). Despite these
constraints, it is still widely used because it allows the PPIs of another species, such as
humans, to be studied in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. When the proteins of interest interact physically, the transcription
factor attached to the reconstituted ubiquitin is released and activates the transcription of
a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of
membrane-bound, insoluble and non-nuclear proteins. However, MYTH shares some drawbacks with
Y2H, such as substantial background noise and the impossibility of quantifying the results
(47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The choice
of reporter and of the split site were decisive elements in the design of the method.
Moreover, because the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to interact spontaneously unless they are very
close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter
a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to
methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in
particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the
observed signal is cell density, that is, the number of cells that managed to grow on the
selection medium. The technique has the advantage of being quantitative while keeping the
proteins under the control of their native promoter (48, 55, 56). Furthermore, PCA results
suggest that the cellular localization of the proteins is preserved: there is gene ontology
enrichment for many proteins known to share the same cellular localization (55). A change in
localization cannot be ruled out, however, because the reporter fragments are added at the
C-terminus, which could interfere with the proteins' localization signal sequence (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. Adding a linker
often reduces these risks by moving the reporter fragment away from the protein to which it
is attached, which reduces interference between the two. The composition or length of the
linker may need to be optimized. There are three categories of linkers: flexible linkers,
rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some
mobility between the protein of interest and the reporter fragment is desirable. Rigid
linkers provide better separation between the protein of interest and the reporter fragment
and ensure that the functions of each element are maintained. They are most useful when a
flexible linker does not sufficiently separate the two elements or interferes with the
activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released
under certain conditions. They are particularly useful for allowing each element to carry out
its own biological activity. It is therefore essential to choose the linker and its
parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although placed in the second category of methods, FRET, cross-linking followed by MS and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two
fluorescent proteins are fused to the two proteins whose proximity is to be tested. When the
two proteins are close to each other, excitation of the donor fluorescent protein leads to
excitation of the acceptor fluorescent protein. The interaction is detected by microscopy or
cytometry through the emission of the acceptor fluorescent protein. This method is
particularly useful for following an interaction over time. However, substantial background
noise and the partial overlap between the fluorescence of the two proteins can hinder
interpretation of the results (60-63).
Cross-linking followed by MS is almost identical to the purification and MS techniques,
except that the proteins are covalently linked to one another before purification. These
links withstand enzymatic digestion and thus provide structural information on how the
proteins associate within the complex. Nevertheless, cross-linking complicates data analysis
and can lead to an incorrect picture of the architecture of the protein complex. The method
is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to
the protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID detects weak and transient interactions in
addition to interactions between neighbouring proteins. However, the biotin ligase is larger
than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology.
This large size can interfere with the activity of the protein of interest or with the
formation of interactions. Moreover, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They provide information on protein proximity, giving access to a
new molecular scale of resolution that is otherwise difficult to reach. Besides being
complex, the existing techniques require specialized infrastructure (equipment and databases)
and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid
methods would help to better define the architecture of protein complexes and their
subcomplexes at low molecular resolution. Such methods would complement the two categories of
methods. They would also compensate for the shortcomings of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. Those high-resolution methods are difficult to apply to many
protein complexes and require an approach specific to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and the linker that joins the reporter fragments to the
proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is
a short, soluble and flexible peptide segment composed of two repeats of the following motif:
four glycines and one serine (GGGGS). It provides good flexibility and allows proper
association of the reporter fragments in the cellular environment. Indeed, glycine and serine
are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker
joins the reporter fragment to the C-terminus of the proteins under study.
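The repeat structure of the linker can be sketched in a few lines of Python. This is only an
illustration: the helper name `flexible_linker` is hypothetical, and the three-repeat example
is an arbitrary choice rather than one of the constructs used in this work. Only the GGGGS
motif and the two-repeat linker come from the text.

```python
# Minimal sketch: build a flexible linker sequence from repeats of the GGGGS motif.
# `flexible_linker` is a hypothetical helper, not part of any published toolkit.

def flexible_linker(repeats: int, motif: str = "GGGGS") -> str:
    """Return the amino-acid sequence made of `repeats` copies of `motif`."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return motif * repeats


# The standard DHFR PCA linker described above: two repeats, ten residues.
standard_linker = flexible_linker(2)

# An illustrative longer linker (arbitrary repeat count, for demonstration only).
longer_linker = flexible_linker(3)
```

Lengthening the linker in this scheme simply means increasing the repeat count, which keeps
the composition constant while changing only the length.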
The length of the linker also constrains the ability to detect an interaction, as was notably
observed by the research team that developed PCA at large scale (55). Studying RNA polymerase
(RNApol) II and several other protein complexes, the authors noted that an interaction was
35 times more likely to be detected when the C-termini of the proteins of interest were less
than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end
to end. Moreover, an earlier study had shown that increasing the length of the linker made it
possible to determine the conformation of a dimeric receptor (69). It is thus possible to
detect new interactions and, in doing so, to obtain new structural information.
1.6 Research objectives
The preceding results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the length of the linker in the DHFR PCA would
allow the detection of interactions increasingly distant in space, thereby modulating the
scale of molecular resolution. This adaptation would then yield a new hybrid method that
could help define protein-protein associations between protein complexes and subcomplexes.
The first objective was to assess the general impact of different linker lengths on the
ability to detect protein-protein associations. To this end, protein-protein associations
between 15 proteins found in seven protein complexes were tested against the proteins found
in those complexes and their known interactors. The second objective was to assess the impact
of increasing linker length on our understanding of the architecture of protein complexes and
their subcomplexes. Five protein complexes differing in size and flexibility were studied:
the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The
study was carried out with different combinations of linker lengths. The final objective was
to determine whether increasing linker length allowed the detection of associations between
proteins that are farther apart in space. To do so, distances between the proteins contained
in the proteasome structures were calculated and compared with the experimental results.
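The comparison between structural distances and detection can be sketched as follows. This is
a minimal illustration under stated assumptions, not the analysis pipeline used in this work:
the protein names and coordinates are invented, the function names are hypothetical, and the
default threshold of 82 Å is the distance reported in (55) for the standard linker.

```python
# Minimal sketch: pairwise distances between C-terminus coordinates, compared with a
# maximum reach. All names and coordinates below are hypothetical examples.
import math
from itertools import combinations


def pairwise_distances(c_termini: dict) -> dict:
    """Return the Euclidean distance (in Å) for every pair of proteins."""
    return {
        (a, b): math.dist(c_termini[a], c_termini[b])
        for a, b in combinations(sorted(c_termini), 2)
    }


def within_reach(distances: dict, max_distance: float = 82.0) -> dict:
    """Flag each pair whose C-termini are no farther apart than `max_distance` (Å)."""
    return {pair: d <= max_distance for pair, d in distances.items()}


# Invented coordinates (Å) standing in for C-terminal atoms taken from a structure.
example_c_termini = {
    "ProtA": (0.0, 0.0, 0.0),
    "ProtB": (30.0, 40.0, 0.0),
    "ProtC": (0.0, 0.0, 120.0),
}

example_distances = pairwise_distances(example_c_termini)
example_reachable = within_reach(example_distances)
```

Raising `max_distance` mimics the expected effect of a longer linker: pairs that were out of
reach at the default threshold become candidates for detection, and the prediction can then
be compared with the experimental signal.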
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive for several reasons, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of protein
complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge
in various fields, such as the determination of protein function, regulatory networks, gene
expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Résumé
Understanding how the cellular system functions requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR fragments
improves the detection of protein-protein interactions and reveals interactions that are more
distant in space. Extended linkers thus provide an improved tool for detecting and measuring
protein-protein interactions and protein proximity in vivo. We used this tool to further
investigate the architecture of the RNA polymerases, the proteasome and the conserved
oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the architecture
of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG)
complexes in yeast. Our results open new avenues for the dissection of protein networks in
living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories,
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex is dependent on stable
association among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology, because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. The first large-scale study performed using DHFR PCA in yeast showed
that the distance constraint determined by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between the C-termini of proteins could be measured,
protein interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes
could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly,
distances between proteins in living cells. Here, we test the effect of linker size on the ability
to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions
were composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert/vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR
F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from
pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and
purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI.
The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was
extracted from gel. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker)/insert (DHFR
F[3])/vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (respectively for resistance to clonNAT and HygB), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene
to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules
following standard procedures, and selection was performed on YPD+clonNAT (DHFR
F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger
sequencing confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse
IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or interact
with one of them, based on data reported in BioGRID in October 2014 (96). A random set of
97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included
in the set of preys as controls. Each prey was present in four replicates, two on each prey
plate, so each interaction was measured four times. Preys were randomly positioned to avoid
location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
proteins that are members of the following complexes: the RNApol I, II and III, the
proteasome and the COG complex. These complexes were selected because they vary in size
(RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome
(n=47, 44 tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned in such a way that, in a block of four strains, all combinations
of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol
and COG complexes and in 16 replicates for the proteasome complex. The blocks were
randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain
a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to
avoid any border effects on the growth of the colonies.
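The block layout described above can be illustrated by enumerating the four linker combinations tested for one bait-prey pair. This is only a minimal sketch: the function name, the tuple representation and the example protein pair are assumptions introduced here for illustration, not the plate-design procedure actually used.

```python
from itertools import product

LINKERS = ("2xL", "4xL")  # the two linker lengths used in the intra-complexes experiment

def linker_block(bait, prey):
    """Return the four bait/prey linker combinations that form one block."""
    # product() yields 2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL, in that order
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]

# Hypothetical example pair
block = linker_block("Rpb2", "Rpb3")
for bait_fusion, prey_fusion in block:
    print(bait_fusion, "x", prey_fusion)
```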
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two
rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation
time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium
and incubated at 30°C for four days, after which a second round of MTX selection was
performed. Plates were incubated at 30°C for another four days. Images were taken with an
EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the
end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
the signal was increased, unchanged or decreased with longer linkers. The same procedure
as described above was used to assess the growth on MTX medium of selected diploid cells
resulting from a new cross between bait and prey strains. The correlation between the results
of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we
confirmed results for 10 pairs of interacting proteins by measuring cell growth in a
spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR
fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold
serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO
DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged
with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
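The area and circularity filter described above can be sketched as follows. This is an illustration rather than the custom ImageJ script itself: it assumes the usual ImageJ definition of circularity, 4π × area / perimeter², and it assumes that a particle must pass both thresholds to be kept. The particle names and measurements are hypothetical.

```python
import math

MIN_AREA = 20          # pixels, threshold from the text
MIN_CIRCULARITY = 0.5  # threshold from the text

def circularity(area, perimeter):
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2

def keep_particle(area, perimeter):
    """Assumption: a particle is kept only if it passes both thresholds."""
    return area >= MIN_AREA and circularity(area, perimeter) >= MIN_CIRCULARITY

# Hypothetical particles given as (area, perimeter) in pixels
particles = {
    "round_colony": (math.pi * 10 ** 2, 2 * math.pi * 10),  # circle of radius 10
    "speck": (10.0, 12.0),                                  # too small
    "streak": (30.0, 60.0),                                 # large enough but elongated
}
kept = [name for name, (a, p) in particles.items() if keep_particle(a, p)]
print(kept)
```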
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
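As a small worked illustration of this transformation and filter, the sketch below applies log2(x + 1) to colony intensities and drops colonies whose diploid-plate size is below the cutoff. The data layout (pairs of MTX intensity and diploid size) and the example values are assumptions made for the example.

```python
import math

MIN_DIPLOID_SIZE = 16  # cutoff on the diploid selection plate, from the text

def transform(intensity):
    """log2(x + 1), so that a null intensity maps to 0 instead of being undefined."""
    return math.log2(intensity + 1)

def filtered_log_intensities(colonies):
    """colonies: iterable of (mtx_intensity, diploid_size) pairs (hypothetical layout)."""
    return [transform(mtx) for mtx, diploid in colonies if diploid >= MIN_DIPLOID_SIZE]

# Hypothetical replicates; the third colony is dropped (diploid size 10 < 16)
replicates = [(7, 120), (0, 95), (1023, 10), (3, 40)]
values = filtered_log_intensities(replicates)
print(values)
```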
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
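The scoring rule above can be written out as a short sketch. It assumes that the background mean (b) and standard deviation (sdb) have already been estimated by the mixture model; the mixture fit itself is not reproduced here. For brevity the same hypothetical background parameters are used for both linker combinations, whereas the text estimates them separately for each combination.

```python
Z_DETECTED = 2.5  # detection threshold from the text
Z_CHANGE = 2.0    # minimal Zs difference counted as a change, from the text

def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb the background mean and standard deviation."""
    return (interaction_score - b) / sdb

def is_detected(zs):
    return zs > Z_DETECTED

def changed(zs_reference, zs_longer):
    """True when the longer-linker Zs differs from the reference Zs by more than Z_CHANGE."""
    return abs(zs_longer - zs_reference) > Z_CHANGE

# Hypothetical background parameters and interaction scores
b, sdb = 5.0, 0.5
zs_2x = z_score(6.0, b, sdb)  # (6.0 - 5.0) / 0.5 = 2.0
zs_4x = z_score(7.5, b, sdb)  # (7.5 - 5.0) / 0.5 = 5.0
print(is_detected(zs_2x), is_detected(zs_4x), changed(zs_2x, zs_4x))
```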
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the
first and third quartiles). Colonies corresponding to the control interaction and positioned
on the array edges were removed from downstream analyses, as well as strains for which
sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering
steps, interactions with at least four replicates for every linker combination were retained
and the median of colony size was used as the Is. Significant interactions were identified as
described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and
standard deviation (sdb) of the background distribution were calculated for each linker
combination and each complex separately. For the COG complex, because the number of
pairwise interactions is limited to 64, all the results were combined to calculate these
parameters. An interaction was considered as being detected when the Zs was larger than
2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
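The extreme-outlier rule stated at the start of this paragraph can be sketched as below. The quartile convention is an assumption (Python's default in `statistics.quantiles`), since the text does not say how Q1 and Q3 were computed, and the colony sizes are hypothetical.

```python
from statistics import quantiles

def remove_extreme_outliers(sizes, k=3.0):
    """Drop values below Q1 - k*IQR or above Q3 + k*IQR (k = 3 in the text)."""
    q1, _, q3 = quantiles(sizes, n=4)  # quartile convention is an assumption
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in sizes if low <= s <= high]

# Hypothetical replicate colony sizes with one extreme value
colony_sizes = [10, 11, 12, 13, 14, 15, 16, 100]
print(remove_extreme_outliers(colony_sizes))
```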
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer
linker combination) were separated from the others and classified as new interactions
(Table S1C). For the remaining pairs, because baits and preys were positioned in such a way
that, in a block of four adjacent strains, all combinations of linker lengths could be tested for
a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker
size combinations could be compared directly. The difference with the reference 2xL-2xL
interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A
paired t-test was used to identify significant differences in colony size (with FDR-corrected
p-values). These pairs of interacting proteins were separated into two additional categories:
unchanged interactions, in cases where the interaction was detected with the reference linker
size (2xL-2xL) and also with the longer linker combinations but without any significant
change (t-test FDR p-value above 0.05); and quantitative changes, in cases where the
interaction was detected with the reference linker size (2xL-2xL) and presented significant
changes for at least one longer linker combination (difference greater than 1 or smaller than
-1 with t-test FDR p-value < 0.05) (Table S1C).
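The classification logic above can be summarized in a short sketch. Two points are assumptions rather than statements from the text: the FDR correction is taken to be the Benjamini-Hochberg procedure (the text only says "FDR corrected"), and pairs with a significant p-value but an absolute difference of 1 or less are reported as unchanged. The paired t-test itself is not reproduced; the sketch starts from p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotonic
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(detected_ref, detected_longer, difference, fdr_p):
    """Classify one protein pair following the three categories in the text."""
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if fdr_p < 0.05 and abs(difference) > 1:
        return "quantitative change"
    # Assumption: every other pair detected with 2xL-2xL is reported as unchanged
    return "unchanged"

# Hypothetical raw p-values from paired t-tests
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(adj)
print(classify(True, True, 1.5, adj[0]))
```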
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex
in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in the PyMOL software. Visual inspection of the resulting superposed 5A5B
structures showed an incorrect overlap in the central core (Fig. S2B). This region is well
resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base,
the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4.
Fig. S2A summarizes the methodology used to build the final proteasome structure. Table
S2C presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface
residues were used as nodes to build the graph. The edges of the graph were placed between
each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the
proteasome. The weight of the edges was equal to the distance between node pairs. Surface
residues were identified as follows. First, the structure of the protein complex was
represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a
solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome,
respectively. These dots were exported in the “wrl” graphic file format. From this file, the
coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II
structure and within 20 Å of the proteasome structure were considered as surface residues
(see Fig. S2D for a representation of the method for the proteasome). In cases where multiple
copies of the proteins were present within the complexes, the mean of the minimal distances
possible was used for the analyses.
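The path-length calculation described above can be sketched in a few lines. The analysis used the Dijkstra implementation in NetworkX; the version below is a self-contained stand-in written with the standard library only, so the function names are assumptions and the coordinates are hypothetical rather than taken from any PDB entry.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = Euclidean distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if the target cannot be reached."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf

# Hypothetical surface C-alpha coordinates (in Å)
surface_ca = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
graph = build_graph(surface_ca, cutoff=15.0)
print(shortest_path_length(graph, 0, 2))  # nodes 0 and 2 are linked only through node 1
print(shortest_path_length(graph, 0, 3))  # node 3 is beyond the cutoff of every other node
```

Because edges exist only between nearby surface nodes, the path follows the surface of the complex instead of cutting straight through it, which is why the resulting length can exceed the straight-line distance between two C-termini.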
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits, and random proteins used as controls, for a total
of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many genetic
interactions and physical interactions (in vitro and in vivo) have been described between the
actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living
cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL
compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with
increased linker size reveals new interactions and could be an improved tool to study
inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the
10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table
S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3]
accounting for only one PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228 or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
fact that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for the
RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the
RNApol complexes, more than half of the new interactions were found between proteins
common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of
the individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions
involved Nas6, an assembly chaperone for the proteasome, and proteins from the base
subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between
Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All
these results show that doubling the linker length of central proteins in complexes expands
the network of interactions detected by DHFR PCA and helps to better describe the
organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E,
left and right panels). Structural biology in living cells could thus gain from PPI data
obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A).
We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For
the proteasome, the complex for which we have the most distance values, a negative
correlation is observed between the pairwise distance and the interaction z-score of PPIs for
all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely
due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer
distances with longer linkers is clearly visible from the cumulative distribution of z-scores
as a function of pairwise distance, where positive z-scores accumulate up to a longer distance
for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The
density distribution of distances within complexes is also slightly shifted towards larger
distances for longer linkers, showing that longer distances are more readily detected with
longer linkers (Fig S1D). Finally, we find that the distance among proteins is significantly
longer for cases where a longer linker increases the signal or leads to the detection of new
interactions (Fig 2C). This demonstrates once again that a longer linker enhances the ability
to detect interactions, especially for proteins that are more distant in space.
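As an illustration of the correlation analysis described above, the following minimal Python
sketch computes a Spearman rank correlation between pairwise distances and interaction
z-scores. It is not the code used in this study: the function names and the toy values are
hypothetical, and only the standard library is used.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: distances between C-termini and PPI z-scores.
distances = [20.0, 35.0, 50.0, 80.0, 120.0]
z_scores = [9.1, 7.4, 7.9, 3.2, 1.0]
rho = spearman(distances, z_scores)
```

A negative `rho` on such data corresponds to the trend reported here, where the signal tends
to decrease as proteins are farther apart.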
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here
we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be
used to detect interactions in these specific conditions, with an increased signal-to-noise
ratio and with an enhanced ability to detect distant PPIs, including interactions among
complexes and subcomplexes within large complexes. Because a single longer linker is
generally sufficient to detect new interactions, the current strains from the DHFR PCA
collection could be used as preys, requiring only the construction of baits with different
linker sizes. PCA is therefore an addition to the other methods available to obtain
low-resolution structural information among subunits of complexes, which include chemical
cross-linking of protein complexes (100), FRET-based analyses (101) and BioID
proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these
other technologies in recent years, PCA will remain the simplest assay because it requires
minimal infrastructure investment and can be adapted for high-throughput screening, which
is still difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1 Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are
highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering
for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square:
proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs
(significantly decreased or increased when compared to the 2xL-2xL reference interaction).
Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected
with a longer linker combination). (C) Proportions of quantitatively changed interactions and
new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal
interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a
single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is
proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI.
Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new
PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the
experiment. (E) Proportion of detected PPIs out of the total tested for each combination of
subcomplexes within complexes.
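The merging of reciprocal interactions described in (C) can be sketched as follows. This is
an illustrative Python sketch, not the analysis pipeline of the study: the category labels,
the placeholder protein names and the rule that keeps the highest-priority category when the
two orientations disagree are all assumptions.

```python
from collections import Counter

# Assumed priority when the two orientations of a pair disagree (hypothetical rule).
PRIORITY = {"unchanged": 0, "changed": 1, "new": 2}

def collapse_reciprocal(calls):
    """Merge (bait, prey) and (prey, bait) entries into one unordered pair."""
    merged = {}
    for (bait, prey), category in calls.items():
        pair = frozenset((bait, prey))  # orientation-independent key
        current = merged.get(pair)
        if current is None or PRIORITY[category] > PRIORITY[current]:
            merged[pair] = category
    return merged

def proportions(merged):
    """Fraction of unique PPIs falling in each category."""
    counts = Counter(merged.values())
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

# Placeholder calls: keys are (bait fused to F[1,2], prey fused to F[3]).
calls = {
    ("X", "Y"): "new",
    ("Y", "X"): "unchanged",
    ("P", "Q"): "unchanged",
    ("M", "N"): "changed",
    ("N", "M"): "changed",
}
merged = collapse_reciprocal(calls)
shares = proportions(merged)
```

Using a `frozenset` as the key makes the two orientations of a pair indistinguishable, so
each PPI is counted once regardless of which protein carried which fragment.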
Figure 2 Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores
for these newly detected interactions are indicated in the tables. DHFR fragments have also
been modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative
z-scores for the proteasome PPIs according to the different protein pairwise distances.
(C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini, for interactions that are not
affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1 Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL.
(D) Density of PPI z-scores for the proteasome for all combinations of linker lengths
according to the distance between the interacting proteins. The red line represents the
density of distances for all interactions. The distribution for detected interactions is shifted
to the left because proteins are closer to each other when the interactions are detected. The
4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker
to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results of the global PCA experiment, showing strong reproducibility.
(F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the
intra-complex experiment. Examples for each category of changes are shown. Cell growth in
the spot-dilution assay (right) correlates with colony size in standard PCA (left).
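The relation between colony size, threshold and z-score described in (B) can be illustrated
with a short Python sketch. The exact normalization used in the study is not given in this
caption, so the sketch assumes the simplest form, a z-score computed against the mean and
standard deviation of a set of background colonies. The function names, the `threshold`
parameter and the toy values are hypothetical.

```python
from statistics import mean, stdev

def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background (noise) distribution."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return [(size - mu) / sigma for size in colony_sizes]

def call_positives(colony_sizes, background, threshold):
    """Flag colonies whose z-score is strictly above the chosen threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]

# Hypothetical toy values (arbitrary units of colony size).
background = [100, 110, 90, 105, 95]
tested = [104, 150, 119, 121]
positives = call_positives(tested, background, threshold=2.5)
```

Because the z-score is a monotonic function of colony size for a fixed background, a cutoff
on the z-score is equivalent to a cutoff on colony size, which is how the threshold lines in
the panels can be read.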
Figure S2 Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
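The distance-weighted shortest path illustrated in (C) and (D) can be sketched as a graph
search. The Python sketch below is only an assumption about how such a calculation may be
set up, not the procedure of the study: surface points are nodes, two points are connected
when they lie within a hypothetical `max_step` cutoff, edges are weighted by their Euclidean
length, and Dijkstra's algorithm returns the length of the shortest path.

```python
import heapq
from math import dist, inf

def shortest_surface_path(points, start, goal, max_step):
    """Length of the shortest path from points[start] to points[goal].

    An edge exists between two points when their Euclidean distance is at
    most max_step; the edge weight is that distance (Dijkstra's algorithm).
    Returns inf when the goal cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            return d
        if d > best.get(u, inf):
            continue  # stale queue entry
        for v in range(len(points)):
            if v == u:
                continue
            step = dist(points[u], points[v])
            if step <= max_step and d + step < best.get(v, inf):
                best[v] = d + step
                heapq.heappush(queue, (d + step, v))
    return inf

# Hypothetical coordinates: three points forming a right angle.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
around = shortest_surface_path(points, 0, 2, max_step=4.5)  # must pass through point 1
direct = shortest_surface_path(points, 0, 2, max_step=6.0)  # direct edge allowed
```

With a small cutoff the path is forced to follow intermediate points, which is the property
that makes a surface path longer than the straight-line distance between two C-termini.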
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
in space without necessarily being in physical interaction. Such a method would make it
possible to examine and dissect the architecture of protein complexes in greater depth.
Concretely, the approach was to modify the length of the linkers used in DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the method
on the study of protein complexes using several combinations of linkers of different
lengths. Finally, the validity of the method could be further confirmed by comparing the
results with distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all the unique interactions detected, more than
30% were new. One could therefore expect to obtain nearly as many new interactions if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths that
are used. For example, it could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect
the majority of the new PPIs and of those whose signal increases.
The rare cases where the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves the central proteins.
Finally, the detected associations accurately reflect the organization of protein complexes
into subcomplexes. By comparing the distances between proteins in the proteasome structures
with the PCA results obtained, it is possible to confirm that increasing the linker length
does allow the detection of associations between proteins that are farther apart in space.
The modification made to DHFR PCA represents a notable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of additional lengths, which would provide a validation and a point
of comparison and would help detect problems that may have occurred during the
construction of the proteins. For example, it becomes easier to spot a faulty recombination
or the appearance of mutations: interactions would be observed for the correctly
constructed protein, whereas the problematic one would show none. However, adding this
control certainly makes the experiments and analyses more complex. Despite this drawback,
this variation of DHFR PCA provides an additional hybrid method that remains relatively
simple. It does not require any particular infrastructure, yet it can also be applied at large
scale using a robotic platform. Furthermore, DHFR PCA is an in vivo method that keeps the
endogenous promoter for protein expression. The fragments do not tend to interact
spontaneously with each other unless they are very close together, which reduces false
positives. DHFR PCA can be performed either on solid medium or in liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be followed in real time, which gives access to the study of
interaction dynamics (56). These elements provide certain advantages compared to the other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. It would
first be necessary to determine the maximum length that allows the detection of plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize new information while taking into account the
additional complexity of each added linker. The availability of robotic platforms makes the
creation of collections of DHFR F[1,2] proteins with different linker lengths more
realistic. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the more the
linker length is increased, the more distant the detected associations are, which decreases
the molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into consideration. For
small protein complexes, a finer resolution, and therefore shorter linkers, could prove
sufficient, whereas the resolution would need to be coarser for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
perfectly known but which are visible by electron microscopy or other imaging methods. The
size of these complexes greatly limits their study and makes determining their architecture
a challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the
linker would give us a general picture of their architecture. At the level of an organism's
proteome, this method would provide a better definition of the organization of the cellular
machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Table of contents
Résumé (French abstract)
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental nature of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
1.3.2 Methods determining the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
Résumé (French abstract)
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography
List of tables
Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
List of figures
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and prove to be useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins
List of abbreviations
%: Percentage
°C: Degree Celsius
Å: Ångström
DNA: Deoxyribonucleic acid
Amp: Ampicillin
mRNA: Messenger ribonucleic acid
BioID: Proximity-dependent biotinylation
ClonNAT: Nourseothricin
COG: Conserved oligomeric Golgi
DHFR: Dihydrofolate reductase
DMSO: Dimethyl sulfoxide
F[1,2]: Fragment 1,2 of DHFR
F[3]: Fragment 3 of DHFR
FDR: Corrected P-value
FRET: Energy transfer between fluorescent molecules
g: Gram
Gly or G: Glycine
h: Hour
HygB: Hygromycin B
Is: Interaction score
L: Litre
Log: Logarithm
M: Molar
Min: Minute
mL: Millilitre
mM: Millimolar
MS: Mass spectrometry
MS/MS: Tandem mass spectrometry
MTX: Methotrexate
MYTH: Membrane yeast two-hybrid
NaCl: Sodium chloride
NMR: Nuclear magnetic resonance
OD: Optical density
PBS: Phosphate-buffered saline
PCA: Protein-fragment complementation assay
PCR: Polymerase chain reaction
PKA: Protein kinase A
PPI: Protein-protein interaction
Q1: Quartile 1
Q3: Quartile 3
r: Correlation coefficient
RNApol: RNA polymerase
Sdb: Standard deviation
Ser or S: Serine
SDS: Sodium dodecyl sulfate
SDS-PAGE: Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test: Student's t-test
YPD: Yeast extract peptone dextrose
Y2H: Yeast two-hybrid
Zs: Z-score
µb: Estimated mean
µg: Microgram
µL: Microlitre
µM: Micromolar
2YT: 2× yeast extract tryptone
2xL: Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL: Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL: Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of several people whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the team. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available to his students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research professionals of the laboratory. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the lively discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chats and laughter at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. Very special thanks to Guillaume and Hélène, who were especially good at putting a smile on my face or supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends. My parents always encouraged me to fulfil myself and to love my work. They not only provided an ideal environment for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.
Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attentiveness, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project director), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were carried out jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to writing a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, through their wide diversity of roles, are considered the machinery of life. Their temporary or permanent associations lie at the heart of signalling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.
Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows additional biological activities to emerge that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, namely by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and in trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the nuclear membrane and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms of the two organisms (30). In addition, perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that is, the capacity of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analysing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34).
Moreover, identifying structural differences in a protein complex between two organisms can
provide useful targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should
eventually make it possible to treat the infections caused by these parasites (35). PPIs
also help to understand the genetic basis of diseases, as shown by Sahni and collaborators.
This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In
nearly 60% of cases, the perturbation of interaction networks, whether partial or complete,
was responsible for the diseases under study. Furthermore, different mutations in the same
gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed
to study them. These methods are complementary, since each has advantages and limitations
that restrict it to a different subset of the interaction network (37). Nevertheless, the
methods can be divided into two main categories: methods that determine the composition of
protein complexes, and methods that determine physical interactions between two proteins.
The first category includes methods that purify a protein complex, by affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second
category groups a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast
two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the
second category share a very similar principle, based on the reconstitution of a functional
reporter that emits a signal when the two proteins physically interact. The second category
also includes three hybrid methods: energy transfer between fluorescent molecules (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this
context, "hybrid method" refers to methods that detect associations between proteins that
are close in space without these associations necessarily being physical interactions.
These methods therefore have characteristics of both categories. In this project, they are
considered part of the second category because they provide information on the spatial
relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other, the relationships among them.
1.3.1 Methods identifying the members of a protein complex: purification
of protein complexes followed by mass spectrometry
Purifying protein complexes and identifying their components by MS aims to isolate a
protein complex and identify its members. Several techniques are used to purify protein
complexes, including affinity chromatography. Affinity chromatography separates a protein
of interest and its interactors from a protein extract by means of an epitope specific to
that protein. This epitope is recognized by an antibody bound to the purification column.
Several rounds of purification can be performed to reduce the non-specific interactions
that cause background noise. The isolated proteins are then digested into peptides. The
mass spectrometer ionizes these peptides and separates them according to their
mass-to-charge ratio, yielding a mass spectrum. Comparing the resulting profiles with those
in a database identifies the proteins present in the complex (38-40). Tandem mass
spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and
fragmented, and a new spectrum is acquired from the resulting fragments. This additional
spectrum provides more information about the peptide (41, 42). Other purification
techniques exist, such as size-exclusion chromatography, in which separation is based on
the size of the protein complexes. The main advantage of this purification is that it can
isolate all the protein complexes of an organism for study (43).
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment
complementation
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest through a linker. When the two proteins of
interest physically interact, the two reporter fragments assemble, reconstituting a
functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor
was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method
that led to the development of several others. However, the technique has some limitations.
First, in classical Y2H, the proteins studied must be soluble. Variations of the method
have nevertheless been introduced to allow the study of membrane proteins (45-47); this
method is the subject of the next paragraph. Second, because the reporter is a
transcription factor, the interactions tested must be localized in the nucleus, possibly
altering the endogenous localization of the proteins. The technique also has low
sensitivity, shows background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is consequently
impossible to relate the abundance of a protein to the strength or abundance of an
interaction between proteins (48-50). Despite these constraints, it is still widely used
because it allows the PPIs of another species, such as humans, to be studied in a simpler
model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. When the proteins of interest physically interact, the transcription
factor bound to the reconstituted ubiquitin is released, activating transcription of a
reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of
insoluble membrane proteins located outside the nucleus. However, MYTH shares some
drawbacks with Y2H, such as substantial background noise and the inability to quantify
results (47-50, 52, 53).
PCA is similar to the two methods described above, but instead of using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The
choice of reporter and of cleavage site were decisive in the design of the method.
Moreover, because the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to interact spontaneously unless they are
very close to each other, which reduces background noise (54). In yeast, PCA uses as its
reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers
resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and
is involved notably in the synthesis of certain DNA bases (purines and thymine). In yeast,
the observed signal is cell density, that is, the number of cells that managed to grow on
the selection medium. This technique has the advantage of being quantitative while
retaining the native promoter of the proteins studied (48, 55, 56). In addition, PCA
results suggest that the cellular localization of proteins is preserved: there is a gene
ontology enrichment for many known proteins sharing the same cellular localization (55).
However, a change in localization cannot be ruled out, since the reporter fragments are
added at the C-terminus, which could interfere with the localization signal sequence of
the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
Adding a linker often reduces these risks by moving the reporter fragment away from the
protein to which it is attached, which reduces interference between the two. The
composition or length of the linker may need to be optimized. There are three categories
of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers
are generally used when some mobility between the protein of interest and the reporter
fragment is desirable. Rigid linkers provide better separation between the protein of
interest and the reporter fragment and ensure that the functions of each element are
maintained. They are mainly useful when a flexible linker does not separate the two
elements well enough or interferes with the activity of the protein. In vivo cleavable
linkers release the reporter fragment under certain conditions. They are particularly
useful for letting each element carry out its own biological activity. It is therefore
essential to choose the linker and its parameters carefully to obtain the expected results
(58, 59).
1.3.2.2 Hybrid methods
Although placed in the second category of methods, FRET, cross-linking followed by MS, and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins located close to each
other. The two fluorescent proteins are fused to the two proteins whose proximity is to be
tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close together. The interaction is detected
by microscopy or cytometry through the emission of the acceptor fluorescent protein. This
method is particularly useful for following an interaction over time. However, substantial
background noise and the partial overlap in fluorescence of the two proteins can hinder
interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that before purification the proteins are attached to one another by
covalent bonds. These bonds resist enzymatic digestion, thereby providing structural
information on how proteins associate within the protein complex. Nevertheless,
cross-linking complicates data analysis and can lead to a mistaken picture of the
architecture of the protein complex. The method is difficult to apply to the global study
of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused
to the protein of interest. Interactors carrying a biotin group on their accessible lysines
are selectively isolated and identified by MS. BioID can detect weak and transient
interactions in addition to interactions between neighboring proteins. However, the biotin
ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in
molecular biology. This large size can impair the activity of the protein of interest or
the formation of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more
global view of the PPI network. They report on the proximity of proteins, giving access to
a scale of molecular resolution that is otherwise difficult to reach. Besides being
complex, existing techniques require specific infrastructure (equipment and databases) and
are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods
would help to better define the architecture of protein complexes and their subcomplexes
at low molecular resolution. Such methods would complement the two categories of methods.
They would also compensate for the shortcomings of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. These high-resolution methods are difficult to apply to many
protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the
detection of protein-protein interactions
Because of its relative simplicity and of the linker that joins the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment composed of two repeats of the
following motif: four glycines and one serine (GGGGS). It provides good flexibility and
good association of the reporter fragments in the cellular environment. Indeed, glycine and
serine are two small, uncharged amino acids, glycine being nonpolar and serine polar. The
linker connects the reporter fragment to the C-terminus of the proteins under study.
Linker length also places a constraint on the ability to detect an interaction, as noted by
the research team that developed PCA at large scale (55). Studying RNA polymerase (RNApol)
II and several other protein complexes, the authors observed that an interaction was 3.5
times more likely to be detected when the C-termini of the proteins of interest were less
than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end
to end. In addition, an earlier study had shown that increasing linker length made it
possible to determine the conformation of a dimeric receptor (69). It is therefore possible
to detect new interactions and, in doing so, to obtain new structural information.
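
The relationship between linker length and detectable distance can be sketched numerically. The snippet below derives a per-residue length from the two figures given in the text (82 Å for two linkers of two GGGGS repeats each, placed end to end) and extrapolates linearly to longer linkers. The linear extrapolation and the function names are illustrative assumptions, not values reported in the text, and a flexible linker in the cell is unlikely to be fully extended, so the numbers are at best rough upper bounds.

```python
# Rough upper bound on the C-terminus-to-C-terminus distance that a pair of
# (GGGGS)n linkers could span. The per-residue length is back-calculated from
# the 82 angstrom quoted for two 2x linkers; extrapolating it linearly to
# longer linkers is an assumption made for illustration only.

MOTIF = "GGGGS"
REFERENCE_REACH_ANGSTROM = 82.0            # two 2x linkers end to end (from the text)
REFERENCE_RESIDUES = 2 * 2 * len(MOTIF)    # 2 linkers x 2 repeats x 5 residues = 20
ANGSTROM_PER_RESIDUE = REFERENCE_REACH_ANGSTROM / REFERENCE_RESIDUES


def linker_sequence(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_reach_angstrom(repeats_a: int, repeats_b: int) -> float:
    """Summed extended length of the two linkers carried by a pair of fusions."""
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n}xL + {n}xL: about {max_reach_angstrom(n, n):.0f} angstrom")
```

Mixed pairs (for example a 2x linker on one protein and a 4x linker on the other) can be evaluated the same way by passing different repeat counts.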
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow
the detection of interactions increasingly distant in space, thereby modulating the scale
of molecular resolution. This adaptation would then yield a new hybrid method that could
help define protein-protein associations between protein complexes and subcomplexes. The
first objective was to assess the general impact of different linker lengths on the ability
to detect protein-protein associations. To this end, protein-protein associations between
15 proteins found in seven protein complexes were tested against the proteins found in
these complexes and their known interactors. The second objective was to assess the impact
of increased linker length on our understanding of the architecture of protein complexes
and their subcomplexes. Five protein complexes differing in size and flexibility were
studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG)
complex. The study was carried out with different combinations of linker lengths. The final
objective was to determine whether increasing linker length allowed the detection of
associations between proteins farther apart in space. To do so, distances between the
proteins in the proteasome structures were calculated and compared with the experimental
results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of
protein complexes and on PPIs. In addition, this organism has played a key role in
advancing knowledge in various fields such as the determination of protein function,
regulatory networks, gene expression, protein interaction networks and the study of human
diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangements. We examined the
potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR
PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR
fragments improves the detection of protein-protein interactions and reveals interactions
that are more distant in space. Extended linkers thus provide an improved tool for
detecting and measuring protein-protein interactions and protein proximity in vivo. We used
this tool to further investigate the architecture of the RNA polymerases, the proteasome
and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for
dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are farther apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations
(31, 36, 74, 75), and have shown how the regulation of protein expression at the
transcriptional, translational and posttranslational levels contributes to the diversity of
protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex memberships or detect physical
association (81). The first category includes methods based on protein purification
followed by mass spectrometry. In this case, protein assignment to a specific complex
depends on stable associations among proteins that survive cell lysis and fractionation or
affinity purification (82, 83). The majority of PPIs that populate interactome databases
derive from such methods because a single purification leads to the inference of many
interactions among the co-purified proteins. Unfortunately, very little is known about the
structural and context dependencies of PPIs inferred from co-complex membership, because
detecting an association does not provide information on the spatial organization of the
complex (84-86). The second category of methods reports binary or pairwise interactions
between proteins and reveals direct or nearly direct interactions. Such methods include the
commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs)
(87) and technologies based on similar principles (52). The two categories are potentially
complementary because, on the one hand, they tell us which proteins assemble into complexes
in the cell and, on the other hand, how proteins may be physically located relative to one
another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future development in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-termini. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94).
Proteins are usually connected to the reporter fragments with a linker of ten amino acids.
In principle, the length of the linker limits the maximum distance between the proteins for
an interaction to be detectable. The first large-scale study performed using DHFR PCA in
yeast showed that the distance constraint determined by linker length could affect the
ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other
protein complexes for which the distance between C-termini of proteins could be measured,
protein interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in
a dimeric membrane receptor (69). Together, these results suggest that linkers of variable
sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit
roughly, distances between proteins in living cells. Here we test the effect of linker size
on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown
on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for solid medium))
containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for
transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX
medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2%
Glucose, 2.5% Noble Agar, Drop-out without adenine, methionine and lysine, and 200 µg/mL
methotrexate (MTX) diluted in DMSO).
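
As a convenience for scaling these recipes, the sketch below converts the stated percentages and antibiotic concentrations into amounts for a given batch volume. It assumes the percentages are weight per volume (1% = 1 g per 100 mL), which the text does not state explicitly, and the stock concentration used in the example is a hypothetical value chosen for illustration.

```python
# Scale a medium recipe to a batch volume.
# Assumption: percentages are weight/volume (1% = 1 g per 100 mL).

YPD_PERCENT_WV = {
    "yeast extract": 1.0,
    "tryptone": 2.0,
    "glucose": 2.0,
    "agar (solid medium only)": 2.0,
}


def grams_needed(percent_wv: dict, volume_ml: float) -> dict:
    """Grams of each solid component for `volume_ml` of medium."""
    return {name: pct * volume_ml / 100.0 for name, pct in percent_wv.items()}


def stock_volume_ul(final_ug_per_ml: float, volume_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Microlitres of stock giving `final_ug_per_ml` in `volume_ml` of medium.

    1 mg/mL equals 1 ug/uL, so ug needed divided by ug/uL gives uL.
    """
    return final_ug_per_ml * volume_ml / stock_mg_per_ml


if __name__ == "__main__":
    for name, grams in grams_needed(YPD_PERCENT_WV, 500).items():
        print(f"{name}: {grams:g} g")
    # Hypothetical 100 mg/mL clonNAT stock; only the 100 ug/mL final
    # concentration comes from the text.
    print(f"clonNAT stock: {stock_volume_ul(100, 500, 100):g} uL")
```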
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and 2% Agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to a linker of varying size. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
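
To make the idea of synonymous repeats concrete, the sketch below translates two different DNA encodings of the motif and checks that they give the same peptide. The two nucleotide sequences are hypothetical examples written for this illustration, not the sequences used in the plasmids; only the glycine and serine codons of the standard genetic code are needed. One common motivation for such recoding, not stated in the text, is to limit exact DNA repeats within the construct.

```python
# Check that two different DNA encodings of the linker motif translate to the
# same peptide. Only Gly and Ser codons (standard genetic code) are included.

CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical encodings of one GGGGS repeat (illustration only).
REPEAT_A = "GGTGGCGGAGGGTCT"
REPEAT_B = "GGAGGTGGGGGCAGC"

if __name__ == "__main__":
    print(translate(REPEAT_A), translate(REPEAT_B))
    print("same peptide:", translate(REPEAT_A) == translate(REPEAT_B))
    print("same DNA:", REPEAT_A == REPEAT_B)
```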
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL,
3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the
plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2]
fragments were then amplified by PCR, digested with DpnI and purified. The plasmid
pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted from the gel.
The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector
ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on
2YT+Amp. Finally, positive clones were verified and confirmed by double digestion with XbaI
and BamHI and by Sanger sequencing.
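
The insert:vector ratio can be turned into DNA masses with a short calculation. The sketch below assumes the 5:1 ratio is a molar ratio, which is the usual convention for assembly reactions but is not stated in the text, and uses the common approximation of 650 g/mol per base pair of double-stranded DNA. The fragment lengths and vector mass in the example are hypothetical placeholders.

```python
# Mass of insert needed for a given insert:vector molar ratio.
# Assumptions: the ratio is molar; dsDNA averages about 650 g/mol per base pair.

DALTON_PER_BP = 650.0


def pmol_dsdna(mass_ng: float, length_bp: int) -> float:
    """Picomoles of a double-stranded DNA fragment from its mass and length."""
    return mass_ng * 1000.0 / (length_bp * DALTON_PER_BP)


def insert_mass_ng(vector_ng: float, vector_bp: int,
                   insert_bp: int, molar_ratio: float) -> float:
    """Nanograms of insert giving `molar_ratio` insert molecules per vector."""
    return molar_ratio * vector_ng * insert_bp / vector_bp


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 5000 bp vector and a 500 bp insert.
    vector_ng, vector_bp, insert_bp = 100.0, 5000, 500
    ng = insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=5)
    ratio = pmol_dsdna(ng, insert_bp) / pmol_dsdna(vector_ng, vector_bp)
    print(f"insert: {ng:g} ng, molar ratio check: {ratio:g}")
```

For the three-fragment assembly described next (two inserts and one vector), the same function can be applied to each insert separately with its own ratio.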
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from
pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR
F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region
was extracted from the gel. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio
of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused to the DHFR fragments (PCR primer sequences are listed in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were
confirmed for all strains by PCR and Sanger sequencing.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to the
2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies directed against them. 20 OD600 units of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and signal on membranes was detected using the
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins that are part of seven
complexes, each fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same
complexes as the baits or interact with one of them according to data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
proteins, members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing
89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
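The block design described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the strain labels and the convention that the bait linker is written first are hypothetical, and the physical placement of the four strains within a block is not modeled.

```python
from itertools import product

# Linker lengths compared in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")

def linker_block(bait, prey):
    """Return the four bait/prey linker combinations tested for one protein pair.

    Assumption: the first element of each pair is the bait, the second the prey.
    """
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]

# Hypothetical protein pair, for illustration only.
block = linker_block("X", "Y")
for bait_strain, prey_strain in block:
    print(bait_strain, "x", prey_strain)
```

Enumerating the combinations with `product` gives the same order as in the text (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL).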
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were kept and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
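The scoring steps above can be sketched in a few lines. This is a minimal illustration, not the analysis script used in the study: the function names and the example numbers are hypothetical, the median is assumed to be taken on the log2-transformed values, and the background mean (b) and standard deviation (sdb) are passed in directly rather than estimated with the mixtools mixture model.

```python
import math
from statistics import median

def interaction_score(colony_intensities, min_replicates=2):
    """Median of log2(x + 1) colony intensities; None if too few replicates."""
    if len(colony_intensities) < min_replicates:
        return None
    return median(math.log2(x + 1) for x in colony_intensities)

def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background distribution."""
    return (score - background_mean) / background_sd

def is_detected(zs, threshold=2.5):
    """An interaction is called detected when its z-score exceeds the threshold."""
    return zs > threshold

# Hypothetical values, for illustration only.
replicates = [1023, 1023, 2047, 511]   # background-subtracted colony intensities
b, sdb = 8.0, 0.5                       # assumed background mean and standard deviation
score = interaction_score(replicates)
zs = z_score(score, b, sdb)
print(score, zs, is_detected(zs))
```

Keeping the background parameters as explicit arguments makes it easy to compute them per linker combination, or per complex, before scoring.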
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as
values below Q1-3(Q3-Q1) or above Q3+3(Q3-Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
kept and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
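The outlier rule stated at the start of this paragraph can be written as a short filter. This is a sketch under assumptions: the function names are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an arbitrary choice, since the convention used in the study is not specified.

```python
from statistics import quantiles

def extreme_outlier_bounds(values, k=3.0):
    """Return (Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)) for a list of colony sizes."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the extreme-outlier bounds."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]

# Hypothetical replicate colony sizes, for illustration only.
sizes = [10, 11, 12, 13, 14, 15, 16, 100]
print(extreme_outlier_bounds(sizes))
print(drop_extreme_outliers(sizes))
```

With k=3 only values far outside the interquartile range are removed, which is more permissive than the usual k=1.5 rule.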
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
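The classification rules above can be summarized in a small sketch. Several points are assumptions: the Benjamini-Hochberg procedure is used as the FDR correction because the text does not name the method, the paired t-test itself is not implemented (raw p-values are taken as input), the function names and the "not detected" label are hypothetical, and pairs with a significant p-value but an absolute difference of 1 or less are grouped with unchanged interactions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_longer, differences, adjusted_p,
                  alpha=0.05, min_diff=1.0):
    """Classify a protein pair from its 2xL-2xL reference and longer-linker results.

    differences and adjusted_p hold one value per longer linker combination
    (2xL-4xL, 4xL-2xL, 4xL-4xL), each compared with the 2xL-2xL reference.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p in zip(differences, adjusted_p):
        if abs(diff) > min_diff and p < alpha:
            return "quantitative change"
    return "unchanged"

# Hypothetical p-values and differences, for illustration only.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adj)
print(classify_pair(True, True, [1.5, 0.2, -0.3], [0.01, 0.5, 0.6]))
```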
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). Cα atoms of surface residues were used as
nodes to build the graph. Edges were placed between each pair of nodes
within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
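The surface-path distance can be illustrated with a small, self-contained sketch. It is not the code used in the study: the analysis relied on NetworkX, whereas this example substitutes a plain-Python Dijkstra implementation built on `heapq` so that it runs without dependencies. The function names, node labels and coordinates are hypothetical, and the identification of surface residues from PyMOL dots is not reproduced.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than cutoff; edge weight = Euclidean distance.

    coords maps a node label (e.g. a surface residue Calpha) to its (x, y, z) position.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target cannot be reached."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue                      # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf

# Hypothetical coordinates in angstroms, for illustration only.
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (20, 10, 0)}
graph = build_graph(coords, cutoff=15)
print(shortest_path_length(graph, "A", "D"))
```

In this toy example the straight-line distance between A and D is shorter than the path length, because the cutoff forces the path to go through intermediate surface nodes, which is the intended behavior of a surface-constrained distance.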
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
the effect is minor and not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits and random proteins used as controls, for a total of
26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtering (197 unique; reciprocal interactions
such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as a single
PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are better detected with longer linker sizes (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
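The relationship between distance and z-score discussed above is summarized with a rank correlation. As a minimal sketch of how a Spearman coefficient can be computed, assuming average ranks for ties, the following example uses hypothetical function names and made-up data; it does not compute p-values and is not the code used for Fig. 2B.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = rank
        start = end + 1
    return ranks

def spearman_r(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Made-up distances (angstroms) and z-scores, for illustration only.
distances = [20, 40, 60, 80, 100]
z_scores = [9, 7, 8, 3, 1]
print(spearman_r(distances, z_scores))
```

A negative coefficient, as in this toy data set, means that z-scores tend to decrease as the distance between C-termini increases.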
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
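The identity values reported in this table can be illustrated with a short calculation. The sketch below is not the procedure used in this work, which is not described here; it assumes that the structure chain and the experimental sequence have already been aligned to equal length, that gaps are written as "-", and that identity is the share of identical residues among aligned (non-gap) positions. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_structure, aligned_experimental, gap="-"):
    """Percent identity over aligned, non-gap positions (assumed definition)."""
    if len(aligned_structure) != len(aligned_experimental):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for s, e in zip(aligned_structure, aligned_experimental):
        # Positions where either sequence has a gap are not compared.
        if s == gap or e == gap:
            continue
        compared += 1
        if s == e:
            matches += 1
    if compared == 0:
        return 0.0
    return round(100.0 * matches / compared, 1)


if __name__ == "__main__":
    # Toy alignment: one mismatch out of five compared positions.
    print(percent_identity("MKTAL", "MRTAL"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so values computed this way are only comparable to the table if the same denominator was used.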
Table S2C. Identity between the proteasome structures and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions
with the 4xL-4xL combination. Correlation coefficients for the other combinations are
r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI
z-scores for the proteasome for all combinations of linker lengths according to the distance
between the interacting proteins. The red line represents the density of distances for all
interactions. The distribution for detected interactions is shifted to the left because
proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complexes
experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
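The thresholding in panel (B) and the reciprocal comparison in panel (C) can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in this work: it assumes that z-scores are computed against the mean and standard deviation of all colony sizes passed in, that the threshold is a free parameter (the default of 2.5 is an assumption), and that reciprocal interactions are compared with a Pearson correlation. The function names and the toy values are hypothetical.

```python
import math
from statistics import mean, pstdev


def z_scores(colony_sizes):
    """Standardize colony sizes against the mean and SD of the whole set."""
    mu = mean(colony_sizes)
    sd = pstdev(colony_sizes)
    if sd == 0:
        # All colonies have the same size: no colony stands out.
        return [0.0 for _ in colony_sizes]
    return [(x - mu) / sd for x in colony_sizes]


def call_positive(colony_sizes, threshold=2.5):
    """Flag colonies whose z-score exceeds the threshold (assumed default)."""
    return [z > threshold for z in z_scores(colony_sizes)]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Toy plate: twenty background colonies and one large colony.
    sizes = [10.0] * 20 + [100.0]
    print(call_positive(sizes).count(True))
    # Toy reciprocal signals: bait-prey versus prey-bait orientation.
    print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Standardizing against the whole plate is only one choice; a background distribution estimated from non-interacting controls would shift the z-scores and therefore the set of positive calls.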
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
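The distance-weighted shortest path of panels (C) and (D) can be sketched with a standard graph search. The code below is an illustration under stated assumptions, not the implementation used in this work: surface residues are reduced to 3D points, two points are connected when they lie within a cutoff distance (the cutoff value is hypothetical), each edge is weighted by the Euclidean distance, and the path between two C-terminal points is found with Dijkstra's algorithm. All names and coordinates are made up for the example.

```python
import heapq
import math


def build_graph(points, cutoff):
    """Connect every pair of points within `cutoff`; weight = Euclidean distance."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (length, path) or (inf, []) if unreachable."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            # Walk back through the predecessors to rebuild the path.
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return dist, path[::-1]
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf, []


if __name__ == "__main__":
    # Hypothetical surface points; indices 0 and 2 stand in for two C-termini.
    surface = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    graph = build_graph(surface, cutoff=4.5)
    print(shortest_path(graph, 0, 2))
```

Restricting the graph to surface points forces the path to go around the complex rather than through it, which is why such a distance is at least as long as the straight-line distance between the two termini.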
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in
space, without those associations necessarily being physical interactions. Such a method
would make it possible to examine and dissect the architecture of protein complexes in
greater depth. Concretely, the approach was to modify the length of the linkers used in the
DHFR PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the method
on protein complexes using several combinations of linkers of different lengths. Finally, the
validity of the method could be further confirmed by comparing the results with distances
measured from the available protein structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all unique interactions
detected were new. One could therefore expect to obtain nearly as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
that are used. For example, it could be lower if only a single combination were used, since
some protein-protein associations were detectable only with one specific combination of
linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to
detect most of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. The use of extended linkers also informs on the organization
of protein complexes, particularly when central proteins are involved. Finally, the detected
associations faithfully reflect the organization of protein complexes into subcomplexes.
Comparing the distances between proteins in the proteasome structures with the PCA results
confirms that increasing the linker length does allow the detection of associations between
proteins that are farther apart in space.
The modification made to the DHFR PCA represents a useful advance in the study of
protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside the longer linkers;
this would provide a validation and a point of comparison, and would help detect problems
that arose during the construction of the proteins. For example, faulty recombination or the
appearance of mutations is easier to spot: interactions would be observed for the correctly
constructed protein, whereas the problematic one would show none. Adding this control
certainly makes the experiments and analyses more complex. Despite this drawback, this
variation of the DHFR PCA provides an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet it can also be applied at large scale
using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the
endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close together, which reduces false positives. The DHFR
PCA can be performed on either solid or liquid medium, so PPIs are easy to study under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the dynamics of interactions (56). These
features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that gives several resolutions of the PPI network. The first step
would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also have to be
determined, so as to maximize the new information gained while taking into account the
complexity added with each additional linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2] proteins with different linker lengths.
Such additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the associations detected, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be
needed for large protein complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are notably
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution offered by the extended linker would give us a general picture of
their architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography
List of tables
Table S1A Description of the strains constructed and used for this study
Table S1B PCA data for global PCA experiment
Table S1C PCA data for intra-complexes experiment
Table S1D PCR primers used in this study
Table S2A Distances between C-termini calculated from molecular modeling
Table S2B Identity between each RNApol structure and the experimental sequences
Table S2C Identity between proteasome structure and the experimental sequence
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
List of figures
Figure 1 Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove to be useful to infer the super-organization of protein complexes
Figure 2 Longer linkers allow for the detection of more distant proteins within complexes
Figure S1 Data related to the PCA experiments
Figure S2 Illustration of the methods used to build the proteasome structure and to calculate distances between proteins
List of abbreviations
% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
μb Estimated mean
μg Microgram
μL Microliter
μM Micromolar
2YT 2x yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of several people whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both as a scientist and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who makes himself available to his students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their invaluable advice, dedication, and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability, and the lively discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building camaraderie with its members. I would like to thank them all for the chatter and laughter at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at putting a smile on my face or supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends. My parents always encouraged me to fulfil myself and to love my work. They not only provided an ideal environment in which to reach my goals throughout my studies, but also offered their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty, and commitment. Thanks to my family and friends, I was able to unwind, simply have fun, and pour my heart out from time to time. They were a source of moral support.
Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge, and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues, and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly, and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that detects associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault, and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé, and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry, and myself.
During this project, I also contributed to writing a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student), and Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, through their great diversity of roles, are considered the machinery of life. Their temporary or permanent associations are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces, and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell because they are involved in all cellular processes and in the maintenance of cellular functions.
Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging, and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell is home to stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows the emergence of additional biological activities that would be impossible for the proteins taken individually. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning, and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26), and several viruses (27-29). These maps represent a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that capture the dynamics of interaction networks by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs offers a new perspective on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore of yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pores. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the complex's ability to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a nonessential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3,000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Moreover, different mutations in the same gene cause different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that restrict it to a different subset of the interaction network (37). Nonetheless, the methods as a whole can be divided into two main categories: methods that determine the composition of protein complexes, and methods that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, either by affinity chromatography or by separation-based chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH), and the protein-fragment complementation assay (PCA). The methods in the second category work on a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these associations necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other, the relationships these components maintain with one another.
1.3.1 Methods that identify the members of a protein complex: purification of protein complexes followed by mass spectrometry
Purifying protein complexes and identifying their components by MS is an approach that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main benefit of this type of purification is that it makes it possible to isolate all the protein complexes of an organism for study (43).
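To make the notion of a mass-to-charge ratio concrete, the following minimal Python sketch computes the theoretical m/z of a protonated peptide from standard monoisotopic residue masses. It is an illustration only, not the analysis pipeline of the cited studies: the function names `peptide_mass` and `mass_to_charge` are hypothetical, the example peptide (the Gly-Gly-Gly-Gly-Ser linker motif) is chosen purely for convenience, and post-translational modifications are ignored.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of one proton (charge carrier)


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mass_to_charge(sequence: str, charge: int) -> float:
    """Theoretical m/z of the peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical example peptide: one repeat of the linker motif.
    for z in (1, 2):
        print(f"GGGGS, z={z}: m/z = {mass_to_charge('GGGGS', z):.4f}")
```

Database identification then amounts, in essence, to comparing observed m/z values with theoretical ones computed in this way for the peptides predicted from each candidate protein.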
1.3.2 Methods that determine the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and protein-fragment complementation
Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that paved the way for several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been developed to allow the study of membrane proteins (45-47); these are the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise, and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to establish links between the abundance of a protein and the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it makes it possible to study the PPIs of another species, such as humans, in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane proteins, which are insoluble and reside outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above but, rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The choice
of the reporter and of the cleavage site were decisive in the design of the method. Moreover,
because the reporter fragments come from a single protein rather than from two subunits of
the same protein, they do not tend to associate spontaneously unless they are very close to
each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated
version of the enzyme dihydrofolate reductase (DHFR) that confers methotrexate (MTX)
resistance on the cell. This enzyme is essential for cell growth and is involved notably in
the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal
is cell density, that is, the number of cells that managed to grow on the selection medium.
The technique has the advantage of being quantitative while preserving the native promoter of
the proteins under study (48, 55, 56). In addition, PCA results suggest that the cellular
localization of the proteins is preserved: there is a gene ontology enrichment for several
known proteins sharing the same cellular localization (55). It is nonetheless possible that a
change in localization occurs, since the reporter fragments are added at the C-terminus,
which could interfere with the localization signal sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. Adding a linker
often reduces these risks by moving the reporter fragment away from the protein to which it
is attached, which reduces interference between the two. Its composition or length may need
to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and
in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the
protein of interest and the reporter fragment is desirable. Rigid linkers provide better
separation between the protein of interest and the reporter fragment and ensure that the
function of each element is maintained. They are mostly useful when a flexible linker is
insufficient to separate the two elements properly or when it interferes with the activity of
the protein. In vivo cleavable linkers release the reporter fragment under certain
conditions. They are particularly useful for allowing each element to carry out its own
biological activity. It is therefore essential to choose the linker and its parameters
carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS
and BioID are hybrid methods that measure protein-protein associations at a lower
resolution.
FRET relies on energy transfer between two fluorescent proteins that are close to each
other. The two fluorescent proteins are fused to the two proteins whose proximity is to be
tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close to each other. The interaction is
detected by microscopy or cytometry through the emission of the acceptor fluorescent protein.
This method is particularly useful for following an interaction over time. However,
substantial background noise and the partial overlap of the fluorescence of the two proteins
can hinder interpretation of the results (60-63).
Cross-linking followed by MS is virtually identical to purification and MS techniques,
except that the proteins are attached to one another by covalent bonds before purification.
These bonds withstand enzymatic digestion and thus provide structural information on how the
proteins associate within the protein complex. Nevertheless, cross-linking complicates data
analysis and can lead to an incorrect picture of the architecture of the protein complex. The
method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase devoid of specificity that
is fused to the protein of interest. Interactors carrying a biotin group on their accessible
lysines are selectively isolated and identified by MS. BioID can detect weak and transient
interactions as well as interactions between neighbouring proteins. However, the biotin
ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used
in molecular biology. This large size can impair the activity of the protein of interest or
the formation of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They report on the proximity of proteins, giving access to a
molecular scale of resolution that is otherwise difficult to reach. Besides being complex,
the existing techniques require specific infrastructure (equipment and databases) and are
difficult to apply on a large scale. The development of simpler, higher-throughput hybrid
methods would help to better define the architecture of protein complexes and their
subcomplexes at low molecular resolution. Such methods would complement the two categories of
methods. They would also compensate for the shortcomings of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes but are difficult to apply to many protein complexes and
require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and of the linker that connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment composed of two repeats of the
following motif: four glycines and one serine (GGGGS). It provides good flexibility and good
association of the reporter fragments in the cellular environment. Indeed, glycine and serine
are two small amino acids; glycine is nonpolar and serine is polar and uncharged. The linker
connects the reporter fragment to the C-terminus of the proteins under study.
Linker length also constrains the ability to detect an interaction, as was notably observed
by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II
and several other protein complexes, the authors noted that an interaction was 35 times more
likely to be detected when the C-termini of the proteins of interest were less than 82 Å
apart (55). This distance corresponds to the length of the two linkers placed end to end.
Furthermore, an earlier study had shown that increasing linker length made it possible to
determine the conformation of a dimeric receptor (69). It is thus possible to detect new
interactions and thereby obtain new structural information.
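The distance argument can be made concrete with a small calculation. The Python sketch below is only an illustration: it assumes that the 82 Å figure corresponds to two fully extended two-repeat linkers (20 residues in total), derives a per-residue length from that figure alone, and extrapolates linearly to longer linkers. The function names and the linear extrapolation are assumptions made here, not part of the cited work, and a flexible linker in a cell would rarely be fully extended.

```python
MOTIF = "GGGGS"  # linker motif: four glycines and one serine

def linker(n_repeats: int) -> str:
    """Return the peptide sequence of a linker made of n_repeats motifs."""
    return MOTIF * n_repeats

# Assumption: 82 Å is the end-to-end length of two 2x linkers (20 residues).
ANGSTROM_PER_RESIDUE = 82 / (2 * len(linker(2)))

def max_reach(bait_repeats: int, prey_repeats: int) -> float:
    """Upper bound (in Å) on the C-terminus distance spanned by two linkers."""
    n_residues = len(linker(bait_repeats)) + len(linker(prey_repeats))
    return n_residues * ANGSTROM_PER_RESIDUE

if __name__ == "__main__":
    for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{bait}xL-{prey}xL: {max_reach(bait, prey):.1f} Å")
```

Under these assumptions, doubling both linkers from two to four repeats doubles the theoretical reach, which is the intuition behind using linker length as a rough molecular ruler.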
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow
the detection of interactions that are increasingly distant in space, thereby modulating the
scale of molecular resolution. This adaptation would then yield a new hybrid method that
could help define protein-protein associations between protein complexes and subcomplexes.
The first objective was to assess the general impact of different linker lengths on the
ability to detect protein-protein associations. To this end, the protein-protein associations
of 15 proteins found in seven protein complexes were tested against the proteins of these
complexes and their known interactors. The second objective was to assess the impact of
increasing linker length on our understanding of the architecture of protein complexes and
their subcomplexes. Five protein complexes differing in size and flexibility were studied:
the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The
study was carried out with different combinations of linker lengths. The last objective was
to test whether increasing linker length allowed the detection of associations between
proteins that are more distant in space. To do so, distances were calculated between the
proteins contained in the proteasome structures and compared with the experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive for several reasons, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of protein
complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge
in various fields such as the determination of protein function, regulatory networks, gene
expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that the use of longer peptide linkers between the fusion proteins and the DHFR
fragments improves the detection of protein-protein interactions and reveals interactions
that are more distant in space. Longer linkers thus provide an improved tool for detecting
and measuring protein-protein interactions and protein proximity in vivo. We used this tool
to further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for
dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the
architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG)
complexes in yeast. Our results open new avenues for the dissection of protein networks in
living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of cellular
functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36,
74, 75), and have shown how the regulation of protein expression at the transcriptional,
translational and posttranslational levels contributes to the diversity of protein complex
assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity
purification (82, 83). The majority of PPIs that populate interactome databases derive from
such methods because a single purification leads to the inference of many interactions among
the co-purified proteins. Unfortunately, very little is known about the structural and
context dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies
based on similar principles (52). The two categories are potentially complementary because,
on the one hand, they tell us which proteins assemble into complexes in the cell and, on the
other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future development in cell and systems biology, because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide
the technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins
are usually connected to the reporter fragments with a linker of ten amino acids. In
principle, the length of the linker limits the maximum distance between the proteins at which
an interaction remains detectable. The first large-scale study performed using DHFR PCA in
yeast showed that the distance constraint imposed by linker length could affect the ability
to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between the C-termini of proteins could be measured, protein
interactions were 35 times more likely to be detected if the C-termini were within 82 Å of
each other. In addition, an earlier study in mammalian cells showed that increasing the
linker length of the PCA reporter allows the detection of configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here we test the effect of linker size on the ability to
detect PPIs in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid
medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB)
for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on
MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2%
glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL
methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to linkers of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
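The idea of encoding each repetition with different synonymous codons can be illustrated with a short Python sketch. The codon tables below are the standard genetic code entries for glycine and serine; the rotation scheme used to pick codons and the function names are purely hypothetical, since the actual codons chosen for the 3xL and 4xL constructs are not listed here.

```python
# Standard genetic code entries for the two residues of the linker motif.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}

def encode_repeat(motif: str, offset: int) -> str:
    """Encode one motif, rotating through synonymous codons (hypothetical scheme)."""
    return "".join(
        CODONS[aa][(i + offset) % len(CODONS[aa])] for i, aa in enumerate(motif)
    )

def encode_linker(n_repeats: int, motif: str = "GGGGS") -> str:
    """Encode n_repeats motifs so that each repeat has a distinct DNA sequence."""
    return "".join(encode_repeat(motif, k) for k in range(n_repeats))

def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence back to its peptide."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))

if __name__ == "__main__":
    dna = encode_linker(4)
    print(dna)
    print(translate(dna))
```

Each repeat differs at the DNA level while the translated peptide remains a plain run of GGGGS motifs.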
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted.
The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio
of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp.
Finally, positive clones were confirmed by double digestion with XbaI and BamHI and by
Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified
from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The
DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR products were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region
was gel-extracted. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector
ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively),
were amplified by PCR from their respective plasmids with oligonucleotides specific to the
gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed
for all strains by PCR and Sanger sequencing.
Estimation of protein abundance
Protein abundance was quantified by Western blot for several strains expressing proteins
fused to the 2xL and 4xL. These proteins were selected because their abundance could readily
be assessed using antibodies directed against them. Exponentially growing cells (20 OD600)
were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma)
were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or
mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes
were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000),
IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belonged to the same complexes as the baits or
interacted with one of them according to data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was
also included in the set of preys as controls. Each prey was present in four replicates, two
on each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the consensus
protein complexes published by (84) to choose 95 central and associated proteins that are
members of the following complexes: RNApol I, II and III, the proteasome and the COG
complex. These complexes were selected because they vary in size (RNApol I (n=14), II
(n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested);
and COG complex (n=8)) and because interactions among their members have been shown to be
detectable, at least partially, by DHFR PCA. In addition, published structures are available
for the RNApol and proteasome complexes, making it possible to compare our results with
known protein complex organization. We successfully constructed 80.0% and 76.6% of the
strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and
100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2]
and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 of the 95
proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα
2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were
generated, including all strains mentioned above. Baits and preys were positioned so that,
in a block of four strains, all combinations of linker sizes could be tested for a specific
interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions
was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the
proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each
1536-array was designed to contain a double border of a strain showing a weak interaction
(Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
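The block design and the border can be sketched in a few lines of Python. This is an illustration only: the protein names are placeholders, and the array geometry assumes the usual 32 × 48 layout of a 1536-format plate with the double border occupying the two outermost rows and columns on every side, which is an interpretation rather than a detail given in the text.

```python
from itertools import product

LINKERS = ("2xL", "4xL")  # linker sizes used in the intra-complexes experiment

def linker_block(bait: str, prey: str) -> list[tuple[str, str]]:
    """Return the four bait/prey linker combinations tested for one interaction."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]

def interior_positions(rows: int = 32, cols: int = 48, border: int = 2) -> int:
    """Positions left for test colonies once the control border is reserved.

    Assumes a 32 x 48 layout for a 1536-format array (hypothetical geometry).
    """
    return (rows - 2 * border) * (cols - 2 * border)

if __name__ == "__main__":
    for bait, prey in linker_block("BaitA", "PreyB"):  # placeholder names
        print(bait, "x", prey)
    print("usable positions:", interior_positions())
```

Keeping the four linker combinations of one interaction in the same block means they share the same local growth conditions on the plate, which makes their colony sizes easier to compare.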
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying
tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the
two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C.
Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an incubation time
of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal between linker sizes was an increase, null or a decrease. The same
procedure as described above was used to assess the growth on MTX medium of selected diploid
cells resulting from a new cross between bait and prey strains. The correlation between the
results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment,
we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a
spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL
DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water.
Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and
DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged
with an EOS Rebel T3i camera (Canon).
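For reference, the cell densities spotted in such an assay follow a simple geometric series. The Python sketch below starts from an OD600/mL of 1 and applies 5-fold dilutions; the number of dilution steps is not stated in the text, so the default used here is an arbitrary assumption.

```python
def serial_dilution(start_od: float = 1.0, fold: int = 5, steps: int = 5) -> list[float]:
    """OD600/mL of each spot, starting undiluted and diluting `fold` times per step.

    `steps` is a hypothetical default; the source does not give the number of spots.
    """
    return [start_od / fold ** i for i in range(steps)]

if __name__ == "__main__":
    for i, od in enumerate(serial_dilution()):
        print(f"spot {i + 1}: OD600/mL = {od:g}")
```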
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using
the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and applied a manual threshold across
plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle closest to the center. We considered the particle
to be a colony if its center of mass was within the mid-distance between two colonies. All
plate images were also examined. The average of the background pixels was subtracted from
the colony intensity.
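The particle filter can be illustrated outside ImageJ with a minimal Python sketch. It uses the circularity definition reported by ImageJ, 4π × area / perimeter², and treats a particle as rejected if it fails either the area or the circularity criterion; that reading of the two criteria, like the function names, is an assumption made for the example and not a description of the custom script.

```python
import math

MIN_AREA = 20          # pixels
MIN_CIRCULARITY = 0.5  # 1.0 corresponds to a perfect circle

def circularity(area: float, perimeter: float) -> float:
    """Shape descriptor 4*pi*area / perimeter**2."""
    return 4 * math.pi * area / perimeter ** 2

def keep_particle(area: float, perimeter: float) -> bool:
    """Keep particles that are large enough and round enough (assumed rule)."""
    return area >= MIN_AREA and circularity(area, perimeter) >= MIN_CIRCULARITY

if __name__ == "__main__":
    round_colony = (math.pi * 10 ** 2, 2 * math.pi * 10)  # disc of radius 10 px
    scratch = (100.0, 104.0)                              # 2 x 50 px streak
    print(keep_particle(*round_colony), keep_particle(*scratch))
```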
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
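These two steps are simple enough to sketch in Python. The field names below are hypothetical, and the size threshold of 16 is used as given, in whatever unit the colony sizes on the diploid plate were recorded.

```python
import math

MIN_DIPLOID_SIZE = 16  # threshold on colony size on the diploid selection plate

def transform(colonies: list[dict]) -> list[float]:
    """Drop colonies that grew poorly as diploids, then log2(x + 1) the MTX signal.

    Each colony is a dict with hypothetical keys 'diploid_size' and 'mtx_intensity'.
    """
    kept = [c for c in colonies if c["diploid_size"] >= MIN_DIPLOID_SIZE]
    return [math.log2(c["mtx_intensity"] + 1) for c in kept]

if __name__ == "__main__":
    example = [
        {"diploid_size": 20, "mtx_intensity": 0},     # no growth on MTX
        {"diploid_size": 16, "mtx_intensity": 1023},  # strong growth on MTX
        {"diploid_size": 10, "mtx_intensity": 500},   # removed: poor diploid growth
    ]
    print(transform(example))
```

Adding 1 before the transformation maps a colony with no signal to 0 instead of an undefined value.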
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained, and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction across
different linker size combinations. We considered a change significant when Zs differed by
more than 2.
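The scoring scheme above can be illustrated with a short sketch. The study used normalmixEM from the R package mixtools; the code below swaps in a deliberately small expectation-maximization loop written with the Python standard library, so it should be read as an approximation of the idea and not as the authors' pipeline. It assumes that the component with the lower mean is the background, and the names `fit_two_normals` and `z_scores` are hypothetical.

```python
import math


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, n_iter=200):
    """Tiny EM fit of a two-component 1D Gaussian mixture.

    Returns [[weight, mean, sd], [weight, mean, sd]] sorted by mean.
    """
    xs = sorted(values)
    half = len(xs) // 2
    params = []
    # Initialize the two components from the lower and upper halves of the data.
    for part in (xs[:half], xs[half:]):
        mu = sum(part) / len(part)
        sd = math.sqrt(sum((v - mu) ** 2 for v in part) / len(part))
        params.append([0.5, mu, max(sd, 1e-3)])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            w = [p * normal_pdf(x, mu, sd) for p, mu, sd in params]
            total = sum(w) or 1e-300
            resp.append([wi / total for wi in w])
        # M-step: update weight, mean and standard deviation of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            params[k] = [nk / len(xs), mu, max(math.sqrt(var), 1e-3)]
    return sorted(params, key=lambda p: p[1])


def z_scores(interaction_scores):
    """Convert interaction scores (Is) to Zs = (Is - b) / sdb.

    b and sdb come from the lower-mean component, assumed to be the background.
    """
    _, b, sdb = fit_two_normals(interaction_scores)[0]
    return [(score - b) / sdb for score in interaction_scores]
```

An interaction would then be called detected when its Zs exceeds 2.5, and two linker combinations would be considered different for a given interaction when their Zs differ by more than 2.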
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as
values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
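The outlier rule at the start of this paragraph can be written compactly. The sketch below is illustrative only: the quartile definition is an assumption (the "inclusive" method of Python's `statistics.quantiles`, which matches the default of R's `quantile`), and the function name `drop_extreme_outliers` is hypothetical.

```python
from statistics import quantiles


def drop_extreme_outliers(colony_sizes, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    k = 3 corresponds to the extreme-outlier fences described in the text.
    """
    q1, _, q3 = quantiles(colony_sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in colony_sizes if low <= v <= high]
```

For replicate values 10, 11, 12, 13, 14 and 100, the quartiles are 11.25 and 13.75, the fences are 3.75 and 21.25, and only the value 100 is discarded.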
At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned such that, within a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations
could be compared directly. The difference from the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL)
and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented a significant change for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
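The classification rules above can be summarized in a few lines. The sketch below makes several assumptions that the text does not settle: the FDR correction is taken to be the Benjamini-Hochberg procedure, the paired t-test p-values are assumed to have been computed beforehand, and a pair with a significant p-value but an absolute difference of at most 1 is treated as unchanged. The names `benjamini_hochberg` and `classify_pair` are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues):
    """Classify one protein pair.

    detected_ref:    detected with the 2xL-2xL reference
    detected_longer: detected with at least one longer linker combination
    diffs:           Is difference from 2xL-2xL, one per longer combination
    fdr_pvalues:     FDR-corrected paired t-test p-values, same order as diffs
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(
        abs(d) > 1 and p < 0.05 for d, p in zip(diffs, fdr_pvalues)
    )
    return "quantitative change" if changed else "unchanged"
```

As a usage example, `classify_pair(True, True, [1.5, 0.2, 0.1], [0.01, 0.5, 0.6])` falls in the quantitative-change category because the first longer combination passes both the difference and the p-value thresholds.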
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface
residues were used as nodes to build the graph. Edges were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
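The distance calculation can be sketched as follows. The study relied on NetworkX, where `networkx.dijkstra_path_length` provides the equivalent call; the code below instead uses a small standard-library Dijkstra so that it runs without dependencies. Node names, coordinates and the helper names `build_graph` and `shortest_path_length` are hypothetical, and the input is assumed to be the Cα coordinates of residues already identified as surface residues.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (in Å).

    coords maps a node name to an (x, y, z) tuple; each edge is weighted
    by the Euclidean distance between its two nodes.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra weighted shortest-path length; math.inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf
```

Because edges only join nearby surface nodes, the path follows the surface of the complex and is therefore at least as long as the straight-line distance between the two C-termini. For three nodes at (0, 0, 0), (10, 0, 0) and (10, 10, 0) with a 12 Å cutoff, the path between the first and last node has length 20 Å, whereas the straight line measures about 14.1 Å.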
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using antibodies
targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein
stability, it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively,
as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as a single
PPI) after filtration.
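The distinction between all pairs and unique pairs can be made concrete with a short sketch. It assumes that each detected interaction is recorded as a (bait, prey) tuple, which is an illustrative data layout and not the format of Table S1C; the function name `unique_pairs` and the protein labels X, Y and Z are placeholders.

```python
def unique_pairs(interactions):
    """Collapse reciprocal bait-prey orientations into unordered pairs.

    interactions is an iterable of (bait, prey) tuples, where the bait is
    fused to DHFR F[1,2] and the prey to DHFR F[3].
    """
    return {frozenset((bait, prey)) for bait, prey in interactions}


detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
print(len(detected), "pairs,", len(unique_pairs(detected)), "unique")
```

Here the X-Y interaction is detected in both orientations and counted once, so three detected pairs reduce to two unique pairs.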
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e., X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig 1D, right panel). Together, these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
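The relationship between distance and z-score discussed above is summarized in Fig 2B with Spearman correlations. As an illustration of that statistic only, and not of the authors' R analysis, a dependency-free sketch is given below; ties receive average ranks, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(distances, z_scores):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(distances), average_ranks(z_scores))
```

A negative coefficient, as reported for every linker combination in the proteasome, indicates that pairs whose C-termini are farther apart tend to give lower z-scores.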
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
39
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
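The counts in Table S2D can be thought of as the gap between the end of the experimental sequence and the last residue resolved in the structure. The table does not state how the counts were obtained, so the following is only a minimal sketch under that assumption. The function name and the example numbers are hypothetical and are not taken from the thesis.

```python
def missing_c_terminal_residues(sequence_length, resolved_positions):
    """Count residues missing from the C-terminal end of a structure.

    sequence_length: length of the full experimental sequence.
    resolved_positions: 1-based sequence positions that have coordinates.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    resolved = list(resolved_positions)
    if not resolved:
        # Nothing is modeled, so the whole chain counts as missing.
        return sequence_length
    last_resolved = max(resolved)
    if last_resolved > sequence_length:
        raise ValueError("resolved position beyond the end of the sequence")
    # Everything after the last resolved position is missing in C-ter.
    return sequence_length - last_resolved


# Hypothetical chain of 137 residues resolved from position 1 to 100.
print(missing_c_terminal_residues(137, range(1, 101)))  # prints 37
```

Internal gaps are deliberately ignored here, since only the C-terminal end matters when the reporter fragment is fused to the C-terminus.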
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results from the intra-complexes experiment. Examples
for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
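Panel (B) describes positive PPIs as those whose colony size exceeds a z-score threshold. The sketch below illustrates that kind of call, assuming the z-score has the usual form Zs = (Is - µb) / Sdb, with µb and Sdb the mean and standard deviation of a background distribution supplied by the caller. This form, the function names and the example values are assumptions made for illustration. The default threshold reads the caption's cutoff as z > 2.5.

```python
def z_score(interaction_score, background_mean, background_sd):
    """Standardize an interaction score against a background distribution."""
    if background_sd <= 0:
        raise ValueError("background_sd must be positive")
    return (interaction_score - background_mean) / background_sd


def call_positive_ppis(scores, background_mean, background_sd, threshold=2.5):
    """Return {pair: True/False} for z-scores strictly above the threshold.

    scores: dict mapping a bait-prey pair label to its interaction score.
    threshold: assumed cutoff; adjust to the value used in the analysis.
    """
    return {
        pair: z_score(score, background_mean, background_sd) > threshold
        for pair, score in scores.items()
    }


# Hypothetical colony-size scores and background parameters.
example_scores = {"A-B": 12.0, "A-C": 5.5}
print(call_positive_ppis(example_scores, background_mean=5.0, background_sd=2.0))
# prints {'A-B': True, 'A-C': False}
```

Keeping the background parameters as explicit arguments leaves open how they are estimated, which this section does not describe.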
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
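Panel (C) refers to a distance-weighted shortest path between C-termini, computed over surface residues. The caption does not spell out the algorithm, so the following is only a minimal sketch of one way such a path could be computed. It assumes a graph in which points closer than a cutoff are connected, with edges weighted by Euclidean distance, and solves it with Dijkstra's algorithm. The function name, the cutoff and the toy coordinates are hypothetical and are not values from the thesis.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Dijkstra over points connected when closer than `cutoff`.

    points: dict mapping a label to an (x, y, z) coordinate.
    Returns (path_length, [labels along the path]), or (inf, []) if
    the end point cannot be reached.
    """
    best = {start: 0.0}
    previous = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            break
        for v, coord_v in points.items():
            if v in visited:
                continue
            step = math.dist(points[u], coord_v)
            if step > cutoff:
                continue  # too far apart to be neighbors on the surface
            candidate = dist_u + step
            if candidate < best.get(v, math.inf):
                best[v] = candidate
                previous[v] = u
                heapq.heappush(heap, (candidate, v))
    if end not in best:
        return math.inf, []
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[end], path


# Toy coordinates: two termini joined through two surface points.
toy_points = {
    "Cter_A": (0.0, 0.0, 0.0),
    "surf_1": (3.0, 4.0, 0.0),
    "surf_2": (6.0, 8.0, 0.0),
    "Cter_B": (6.0, 8.0, 5.0),
}
length, path = shortest_surface_path(toy_points, "Cter_A", "Cter_B", cutoff=6.0)
print(round(length, 3), path)
```

Restricting the path to surface points is what makes it longer than the straight-line distance between the two termini, which would otherwise pass through the body of the complex.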
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that lie close together in space
without necessarily being in physical contact. Such a method would make it possible to
dissect the architecture of protein complexes in greater depth. In practice, this meant
modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the
method, we first had to verify whether increasing the linker length changed the set of
interactions detected. It was also relevant to test the method on protein complexes using
several combinations of linkers of different lengths. Finally, the validation could be
completed by comparing the results with distances measured from the available protein
structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all unique interactions
detected were new. One could therefore expect to obtain nearly as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
that are used. For example, it could be lower if only one combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves the central proteins.
Finally, the associations detected reflect well the organization of protein complexes into
sub-complexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are further apart in space.
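To give a rough sense of scale for the linkers discussed here, the sketch below builds the 2xL, 3xL and 4xL sequences from the Gly-Gly-Gly-Gly-Ser motif given in the list of abbreviations and computes an upper bound on their fully extended length. The figure of 3.8 Å per residue is an assumption (the approximate spacing between consecutive alpha carbons) and is not a value from the thesis. A flexible linker in the cell is expected to span much less than this bound on average, so the numbers indicate a maximal reach only.

```python
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser repeat unit
ASSUMED_ANGSTROM_PER_RESIDUE = 3.8  # assumption: approximate CA-CA spacing


def linker_sequence(repeats):
    """Amino-acid sequence of a linker made of `repeats` copies of the motif."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return MOTIF * repeats


def max_extended_length(repeats, per_residue=ASSUMED_ANGSTROM_PER_RESIDUE):
    """Upper bound (in angstroms) on the span of a fully stretched linker."""
    return len(linker_sequence(repeats)) * per_residue


for name, repeats in (("2xL", 2), ("3xL", 3), ("4xL", 4)):
    seq = linker_sequence(repeats)
    print(name, seq, len(seq), "residues,",
          round(max_extended_length(repeats), 1), "angstroms at most")
```

Under this assumption, doubling the linker from 2xL to 4xL doubles the upper bound on its span, which is consistent with the observation that longer linkers reach more distant partners.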
The modification made to the DHFR PCA is a real advance in the study of protein-protein
associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases
the ability to detect distant protein-protein associations. In future experiments, it would
be appropriate to use the standard linker alongside the longer ones. This would provide a
validation and a point of comparison, and would help detect problems that may have occurred
during the construction of the fusion proteins, such as incorrect recombination or the
appearance of mutations. Interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. Adding this control does, however,
make the experiments and the analyses more complex. Despite this drawback, this variation of
the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires
no particular infrastructure, yet it can also be applied at large scale with a robotic
platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for
protein expression. The fragments do not tend to associate spontaneously unless they are very
close together, which reduces false positives. The DHFR PCA can be performed on solid or in
liquid medium. It is therefore easy to study PPIs under several growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives access
to the dynamics of interactions (56). These features offer some advantages over other hybrid
methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. The first
step would be to determine the maximal length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also have to be
determined, so as to maximize the new information gained while taking into account the
complexity added with each extra linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths.
Such additional collections would give a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the associations detected, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be
appropriate for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are linked to
various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale
of resolution afforded by extending the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
List of tables
Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II
and III and proteasome structures
List of figures
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization of
protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
List of abbreviations
% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P-value
FRET Fluorescence resonance energy transfer
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 First quartile
Q3 Third quartile
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
µb Estimated mean
µg Microgram
µL Microliter
µM Micromolar
2YT 2× yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of several people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the team. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available for his students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to me during this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, their dedication and their availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Their excellent work has made my thesis better. Special thanks to Xavier for his help, his availability and the stimulating discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the lab, which becomes like a second home. Hence the importance of sharing laughter and building a rapport with its members. I would like to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of course, the support, both in the lab and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at putting a smile on my face or supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends. My parents have always encouraged me to fulfill myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also offered their moral support and instilled in me the importance of always doing my best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.
Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that detects associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to writing a literature review published in Briefings in Functional Genomics in March 2016 under the title Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, through their great diversity of roles, are considered the machinery of life. Their temporary or permanent associations are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.
Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs of a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows additional biological activities to emerge that would be impossible for the individual proteins. An example that illustrates this concept well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples illustrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the network and do not fully account for the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that take the dynamics of interaction networks into account, namely by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the nuclear membrane and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They thereby identified the proteins essential for the assembly of these complexes and demonstrated the importance of paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually make it possible to treat infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3,000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks, whether partial or complete, was responsible for the diseases under study. Furthermore, different mutations in the same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a different subset of the interaction network (37). Nonetheless, all the methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine physical interactions between two proteins.
The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category comprises a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category work on a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close in space without these associations necessarily being physical interactions. These methods therefore share the characteristics of both categories. In this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships among them.
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several purifications can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, yielding a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all the protein complexes of an organism for study (43).
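As a minimal illustration of the mass-to-charge ratio on which this separation relies, the sketch below computes m/z for a peptide carrying one, two or three protons. It assumes positive-mode ionization by protonation, which the text does not specify; the function name `mass_to_charge` and the peptide mass are hypothetical and chosen only for illustration.

```python
# Mass of a proton in daltons (standard physical constant).
PROTON_MASS_DA = 1.007276


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide of the given neutral mass carrying `charge` protons.

    Assumption of this sketch: each charge comes from one added proton.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass in Da
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason spectra are matched against a database rather than read directly.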
1.3.2 Methods determining the protein interaction network
1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid and the protein-fragment complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations have nevertheless been introduced to allow the study of membrane proteins (45-47); this method is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise and is not quantitative. It often requires protein overexpression, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).
In MYTH, the two reporter fragments are a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).
PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were decisive elements in the design of the method. Furthermore, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers methotrexate (MTX) resistance on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while also preserving the natural promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of proteins is preserved: there is gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, given that the reporter fragments are added at the C-terminus, which could interfere with the proteins' localization signal sequence (57).
A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two proteins. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to properly separate the two elements or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are of particular interest for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins located close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hinder interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS techniques, except that, before purification, the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion, providing structural information on how the proteins are associated within the protein complex. Nevertheless, cross-linking complicates data analysis and may lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is performed by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly interesting because they give a more global view of the PPI network. They provide information on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, the latter are difficult to apply to many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids: glycine has only a hydrogen atom as its side chain, and serine is polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.
Linker length also places a constraint on the ability to
detect an interaction, as notably observed by the research team that
developed large-scale PCA (55). While studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was
3.5 times more likely to be detected when the C-termini of the proteins of interest were
less than 82 Å apart (55). This distance corresponds to the length of the
two linkers placed end to end. In addition, an earlier study had shown that
increasing linker length made it possible to determine the conformation of a
dimeric receptor (69). Thus, longer linkers can reveal new interactions and,
in doing so, provide new structural information.
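As a rough illustration of the arithmetic behind this 82 Å limit, the Python sketch below derives a per-residue length from the figures quoted above (two linkers of two GGGGS repeats each, spanning 82 Å end to end) and extrapolates it linearly to longer linkers. The linear extrapolation, the function names and the treatment of linkers as fully extended chains are assumptions made for illustration only; flexible linkers in a cell are rarely fully extended, so the values are best read as upper bounds.

```python
MOTIF = "GGGGS"                  # linker repeat unit described in the text
REFERENCE_SPAN_ANGSTROM = 82.0   # two 2x linkers placed end to end (55)

# Assumption: the 82 A span is shared evenly by the 2 * 2 * 5 = 20 residues.
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / (2 * 2 * len(MOTIF))


def linker_sequence(n_repeats):
    """Peptide sequence of a linker made of n_repeats GGGGS motifs."""
    return MOTIF * n_repeats


def max_span_angstrom(bait_repeats, prey_repeats):
    """Hypothetical maximal span of two fully extended linkers, end to end."""
    n_residues = (bait_repeats + prey_repeats) * len(MOTIF)
    return n_residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        span = max_span_angstrom(bait, prey)
        print(f"{bait}x linker + {prey}x linker: about {span:.0f} A at most")
```

Under these assumptions, each extra GGGGS repeat on either fusion adds roughly 20 Å to the theoretical reach, which is the intuition behind testing longer linkers.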
1.6 Research objectives
The previous results suggest that linker length can influence our
ability to detect PPIs. The hypothesis of my work was that increasing the
linker length in DHFR PCA would allow the detection of interactions increasingly
distant in space, which would modulate the scale of molecular resolution. This
adaptation would then yield a new hybrid method that could help
define protein-protein associations between protein complexes and subcomplexes. The
first objective was to assess the general impact of different linker lengths on
the ability to detect protein-protein associations. To this end,
protein-protein associations involving 15 proteins found in seven protein complexes
were tested against the proteins found in these complexes and their known interactors.
The second objective was to assess the impact of increasing linker length
on our understanding of the architecture of protein complexes and their subcomplexes.
Five protein complexes differing in size and flexibility were
studied: the proteasome, RNApol I, II and III, and the conserved
oligomeric Golgi (COG) complex. The study was carried out with different combinations of
linker lengths. The final objective was to determine whether increasing
linker length allowed the detection of associations between proteins that are farther
apart in space. To do so, distances were calculated between the proteins
in the proteasome structures and compared with the experimental
results.
This study was carried out using the eukaryotic model organism S. cerevisiae.
Yeast is particularly attractive for several reasons, notably the availability
of numerous powerful genetic tools, its rapid cell division and
the abundance of data on protein complex structure and PPIs.
Moreover, this organism has played a key role in advancing knowledge in
various fields, such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks and the study of
human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system functions requires cataloguing how
proteins assemble with one another into complexes and determining their
spatial arrangements. We examined the potential of the dihydrofolate reductase-based
protein-fragment complementation assay (DHFR PCA) in yeast to obtain
low-resolution structural restraints on protein complexes. We showed that
using extended peptide linkers between the fusion proteins and the
DHFR fragments improves the detection of protein-protein interactions and reveals
interactions that are more distant in space. Extended linkers thus provide an
improved tool for detecting and measuring protein-protein interactions and protein proximity
in vivo. We used this tool to further investigate the architecture of the RNA
polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our
results open new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast protein-fragment complementation assay based on
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the architecture
of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes
in yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories,
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
association among proteins that survives cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods,
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology, because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. The first large-scale study performed using DHFR PCA in yeast showed
that the distance constraint determined by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes
could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly,
distances between proteins in living cells. Here we test the effect of linker size on the ability
to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was
gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively),
were amplified by PCR from their respective plasmids with oligonucleotides specific to the
gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing
confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to the
2xL and 4xL. These proteins were selected because we could readily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or
mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG
(1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same
complexes as the baits or interact with one of them, based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered
the consensus protein complexes published in (84) to choose 95 central and associated
protein members of the following complexes: RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
published structures are available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
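The block design described above can be pictured with a short Python sketch. It is only an illustration of the layout logic, not the script used in this work: it assumes the standard 32 × 48 grid of a 1536-position array, reserves a two-colony border, and drops each bait-prey pair into a randomly chosen 2 × 2 block holding the four linker combinations. The function names, the example protein names, and the convention that the first linker refers to the bait are hypothetical.

```python
import random
from itertools import product

ROWS, COLS = 32, 48   # standard geometry of a 1536-position array
BORDER = 2            # double border reserved for the control strain
LINKER_COMBOS = list(product(["2xL", "4xL"], repeat=2))  # (bait, prey) linkers
OFFSETS = [(0, 0), (0, 1), (1, 0), (1, 1)]               # cells of a 2 x 2 block


def block_corners():
    """Top-left positions of all 2 x 2 blocks that fit inside the border."""
    return [(r, c)
            for r in range(BORDER, ROWS - BORDER, 2)
            for c in range(BORDER, COLS - BORDER, 2)]


def layout_blocks(pairs, seed=0):
    """Map (row, col) -> (bait, prey, bait_linker, prey_linker).

    Each bait-prey pair receives one randomly positioned 2 x 2 block in
    which all four linker combinations sit next to each other.
    """
    corners = block_corners()
    if len(pairs) > len(corners):
        raise ValueError("more pairs than available blocks on one plate")
    rng = random.Random(seed)
    rng.shuffle(corners)
    plate = {}
    for (bait, prey), (row, col) in zip(pairs, corners):
        for (d_row, d_col), (bait_l, prey_l) in zip(OFFSETS, LINKER_COMBOS):
            plate[(row + d_row, col + d_col)] = (bait, prey, bait_l, prey_l)
    return plate


if __name__ == "__main__":
    # Hypothetical pair names, used only to show the output format.
    demo = layout_blocks([("ProteinA", "ProteinB"), ("ProteinC", "ProteinD")])
    for position in sorted(demo):
        print(position, demo[position])
```

Keeping the four combinations adjacent means they experience the same local growth conditions, which is what later allows their colony sizes to be compared directly.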
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool operated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed into 384-format arrays and finally into 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates in
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted, and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection, using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, and used the particle closest to the center. We considered the
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
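The transformation and filter just described amount to a few lines of Python. The sketch below is illustrative only: the function name and the input layout (pairs of MTX intensity and diploid-plate colony size) are assumptions, and the cut-off of 16 is taken as printed above.

```python
import math


def log2_scores(colonies, min_diploid_size=16):
    """Return log2(intensity + 1) for colonies that grew on the diploid plate.

    colonies: iterable of (mtx_intensity, diploid_size) pairs (assumed layout).
    Colonies whose diploid-selection size is below the cut-off are dropped,
    since a missing diploid cannot report on an interaction.
    """
    return [math.log2(intensity + 1)
            for intensity, diploid_size in colonies
            if diploid_size >= min_diploid_size]


if __name__ == "__main__":
    # The +1 offset keeps an intensity of 0 finite after the transform.
    print(log2_scores([(0, 50), (3, 50), (7, 10)]))
```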
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction across
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
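To make the scoring step concrete, here is a minimal Python sketch of the same idea: fit a two-component normal mixture by expectation-maximization, take the component with the lower mean as the background, and convert scores to z-scores. This is a simplified stand-in for the mixtools normalmixEM fit named above, not a reimplementation of it; the initialization, the fixed number of iterations, the choice of the lower-mean component as background and all function names are assumptions.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=100):
    """EM fit of a 1-D mixture of two normals.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean, so the
    first entry is treated here as the background distribution.
    """
    lowest, highest = min(scores), max(scores)
    span = highest - lowest
    means = [lowest + 0.25 * span, lowest + 0.75 * span]
    sds = [statistics.pstdev(scores)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            if n_k < 1e-9:
                continue
            weights[k] = n_k / len(scores)
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / n_k
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / n_k
            sds[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(weights, means, sds), key=lambda comp: comp[1])


def z_scores(scores, b, sdb):
    """Zs = (Is - b) / sdb for every interaction score."""
    return [(s - b) / sdb for s in scores]


if __name__ == "__main__":
    random.seed(0)
    # Simulated scores: a large background population plus a few real signals.
    sim = ([random.gauss(0, 1) for _ in range(900)]
           + [random.gauss(8, 1) for _ in range(100)])
    (_, b, sdb), _signal = fit_two_normals(sim)
    detected = sum(z > 2.5 for z in z_scores(sim, b, sdb))
    print(f"background mean={b:.2f}, sd={sdb:.2f}, detected={detected}")
```

Scoring against the fitted background rather than against all colonies keeps the threshold from drifting when a plate happens to contain many true interactions.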
For the intra-complexes experiment, extreme outliers on the MTX selection plates, that is, values
below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
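The outlier rule and the replicate filter at the start of this procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis script itself: quartiles are computed with the default method of statistics.quantiles, which may differ slightly from the quartile definition actually used, and the function names are hypothetical.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median colony size (Is), or None when too few replicates remain."""
    if len(replicates) < min_replicates:
        return None
    return statistics.median(replicates)


if __name__ == "__main__":
    sizes = [10, 10, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 14, 100]
    kept = drop_extreme_outliers(sizes)
    print(len(sizes) - len(kept), "outlier(s) removed; Is =", interaction_score(kept))
```

With k = 3 only very distant colonies are discarded, so ordinary plate-to-plate variation stays in the data.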
At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations
could be compared directly. The difference from the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, where the interaction was detected with the reference linker size (2xL-2xL)
and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05); and quantitative changes, where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
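The three categories can be summarized as a small decision function. The Python sketch below is a hedged illustration: the text does not name the FDR procedure, so the Benjamini-Hochberg correction shown here is an assumption, the raw p-values of the paired t-tests are taken as given inputs, and pairs detected with the reference linker but not with longer ones are simply handled by the same rule as the others.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(z_ref, z_long, diffs, fdr_p,
                  z_cut=2.5, diff_cut=1.0, alpha=0.05):
    """Classify one protein pair.

    z_ref:  Zs with the reference 2xL-2xL combination.
    z_long: Zs for the longer combinations (2xL-4xL, 4xL-2xL, 4xL-4xL).
    diffs:  Is difference of each longer combination from the reference.
    fdr_p:  FDR-corrected p-values of the matching paired t-tests.
    """
    detected_ref = z_ref > z_cut
    detected_long = any(z > z_cut for z in z_long)
    if not detected_ref and not detected_long:
        return "not detected"
    if not detected_ref:
        return "new interaction"
    changed = any(abs(d) > diff_cut and p < alpha for d, p in zip(diffs, fdr_p))
    return "quantitative change" if changed else "unchanged"


if __name__ == "__main__":
    raw_p = [0.001, 0.40, 0.002]
    print(classify_pair(5.0, [7.0, 5.1, 8.0], [1.5, 0.2, 2.0],
                        benjamini_hochberg(raw_p)))
```

Requiring both a minimum effect size and a corrected p-value guards against calling a change on a difference that is statistically detectable but negligible in magnitude.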
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were
used as nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
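The surface-path distance can be illustrated with a compact, self-contained Python sketch. The analysis itself used NetworkX; the version below swaps in a small hand-written Dijkstra search over a toy set of coordinates so that it runs without external packages. The node names, the coordinates and the inclusive use of the cutoff are hypothetical.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Connect nodes closer than `cutoff`; edge weight = Euclidean distance.

    coords: dict mapping a node label to its (x, y, z) position, standing in
    for the C-alpha atoms of surface residues.
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


if __name__ == "__main__":
    # Toy coordinates in angstroms; A and D are too far apart for a direct edge.
    toy = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (20, 10, 0)}
    graph = build_surface_graph(toy, cutoff=15)
    print(f"surface path A to D: {shortest_path_length(graph, 'A', 'D'):.1f}")
    print(f"straight line A to D: {math.dist(toy['A'], toy['D']):.1f}")
```

Measuring along the surface graph rather than in a straight line reflects the fact that linkers and reporter fragments must travel around the complex, not through it.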
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, the effect is minor and not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55) we
constructed reporter strains for 15 proteins that are part of seven complexes with the 2xL
3xL and 4xL fused to the DHFR F[12] fragment each time Using high-density yeast colony
arrays (57) we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL) These include proteins known to interact with the baits that are within
the same complexes as the baits or that are random proteins used as controls for a total of
26640 potential interactions in four replicates (Table S1B) We detected 99 110 and 126
PPIs (z-score greater than 25) with the 2xL 3xL and 4xL respectively (Fig S1B top left
panel) revealing a significant increase in signal-to-noise ratio with longer linkers
particularly for the 4xL Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases two increases) and the 4xL (seven increases) as
compared to the 2xL assay (Fig 1A) Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct Moreover the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes suggesting that there is no decrease
in specificity with the elongated linkers Finally for the cases where proteins were not in the
same complex or were not previously shown to interact it is likely that they represent actual
interactions previously undetected in living cells For example many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97 98) Here we detect some interactions in living cells (such as
between Arc18 and Pup1) often with an increased signal with the 4xL compared to the 2xL
(Table S1B) All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations
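The size of the screen and the z-score cutoff described above can be illustrated with a minimal sketch. The bait, prey and threshold values come from the text; the normalization (standardizing each colony size against the mean and standard deviation of the whole set) is only an assumption, since the exact procedure is not described in this section, and the toy colony sizes are invented.

```python
from statistics import mean, stdev

# Screen dimensions reported in the text.
N_BAIT_PROTEINS = 15
LINKERS = ("2xL", "3xL", "4xL")
N_PREYS = 592

n_baits = N_BAIT_PROTEINS * len(LINKERS)  # 45 bait strains
n_tests = n_baits * N_PREYS               # 26,640 bait-prey combinations

Z_THRESHOLD = 2.5  # detection cutoff stated in the text


def z_scores(colony_sizes):
    """Standardize colony sizes against the mean and SD of the same set.

    Assumption: the background is approximated by the whole distribution;
    the normalization actually used in the study may differ.
    """
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def call_ppis(colony_sizes, threshold=Z_THRESHOLD):
    """Return the indices of colonies whose z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(colony_sizes)) if z > threshold]


# Toy data (invented): background colonies plus one strong signal.
toy_sizes = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 250]
hits = call_ppis(toy_sizes)
```

With these toy values only the last colony is called as a PPI; real plates would contain many more colonies, which makes the background estimate more stable.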
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the 10,192
unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2] - Y-DHFR F[3] and Y-DHFR F[1,2] - X-DHFR F[3] counted as
a single PPI).
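Counting reciprocal detections as a single PPI amounts to treating each bait-prey pair as unordered. A minimal sketch follows, using the placeholder names X, Y and Z from the text's own notation; the example detections are invented.

```python
def unique_pairs(detected):
    """Collapse directed (bait, prey) detections into unordered protein pairs.

    Each detection is a (bait, prey) tuple, where the bait carries DHFR F[1,2]
    and the prey carries DHFR F[3]. (X, Y) and (Y, X) map to the same pair.
    """
    return {frozenset(pair) for pair in detected}


# Invented example: X-Y is detected in both orientations, X-Z in one only.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
unique = unique_pairs(detected)
n_directed = len(detected)  # 3 detected bait-prey combinations
n_unique = len(unique)      # 2 unique PPIs
```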
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2] - Y-DHFR F[3] versus Y-DHFR F[1,2] - X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228 or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
conclusion that no overall decrease in specificity is seen with the elongated linkers.
However, the increased linker length had an obvious impact for 93 (83 unique) interacting
pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs,
and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker
length can substantially widen the repertoire of detected interactions for a complex.
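The categories used above (unchanged, quantitatively changed, new) can be sketched as a simple decision rule. The detection cutoff of 2.5 comes from the text; the two-fold criterion for a quantitative change is borrowed from the global screen and is only an assumption for the intra-complex experiment, where the text speaks of significant increases or decreases without giving the test.

```python
Z_THRESHOLD = 2.5  # detection cutoff stated in the text
FOLD_CHANGE = 2.0  # assumed cutoff for a quantitative change


def classify(z_ref, z_long):
    """Classify a protein pair from its reference and longer-linker z-scores.

    z_ref:  z-score with the 2xL-2xL reference combination.
    z_long: z-score with a combination that includes at least one 4xL.
    """
    detected_ref = z_ref > Z_THRESHOLD
    detected_long = z_long > Z_THRESHOLD
    if not detected_ref and not detected_long:
        return "not detected"
    if not detected_ref:
        return "new"
    if not detected_long:
        return "lost"
    ratio = z_long / z_ref
    if ratio >= FOLD_CHANGE:
        return "increased"
    if ratio <= 1 / FOLD_CHANGE:
        return "decreased"
    return "unchanged"


# Consistency check on the counts reported in the text (all pairs, then unique).
unchanged, changed, new = 135, 19, 74
unchanged_u, changed_u, new_u = 114, 18, 65
total = unchanged + changed + new           # 228 interacting pairs
total_u = unchanged_u + changed_u + new_u   # 197 unique pairs
```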
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to
PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two
largest components of RNApol II, contributes to five out of the nine quantitatively
decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and
Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric
effects rather than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual
polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left
and right panels). Structural biology in living cells could thus gain from PPI data obtained
with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for
distance, we measured the shortest path between C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig. 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the cumulative
distribution of z-scores as a function of pairwise distances, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig. 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that
the distance between proteins is significantly longer for cases where a longer linker
increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates
once again that a longer linker enhances the ability to detect interactions, especially for
proteins that are more distant in space.
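The rank correlation between pairwise distance and z-score reported for the proteasome can be illustrated with a small, dependency-free Spearman implementation. The distances and z-scores below are invented toy values, not data from the study, and the sketch does not compute a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values (invented): C-terminus distances in angstroms and PPI z-scores.
distances = [20, 45, 60, 90, 130, 170]
zscores = [9.1, 7.4, 7.9, 4.2, 3.0, 1.1]
rho = spearman(distances, zscores)  # negative: signal drops with distance
```

A negative coefficient, as in this toy example, corresponds to the trend described above: pairs whose C-termini are farther apart tend to give lower PCA signal.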
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here we
show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection
could be used as preys, requiring only the construction of baits with different linker
sizes. PCA is therefore an addition to the other methods available to obtain low-resolution
structural information among subunits of complexes, which include chemical cross-linking of
protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent
biotinylation in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and
a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with
red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes:
new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer
linker combination). (C) Proportions of quantitatively changed interactions and new PPIs
versus unchanged PPIs for all complexes, counting each pair of reciprocal interactions,
such as X-DHFR F[1,2] - Y-DHFR F[3] and Y-DHFR F[1,2] - X-DHFR F[3], as a single PPI.
(D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional
to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines:
unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe
patterns inside colored boxes represent proteins that were absent from the experiment.
(E) Proportion of detected PPIs out of the total tested for each combination of
subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini: interactions that are
not affected by longer linkers, those that increase in signal and those that are newly
detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair
protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI
signal (colony size) obtained in the global PCA (top left) and in the intra-complex
experiments (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom
right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to
positive PPIs and have a z-score above 2.5. (C) Example of correlation observed for PPI
signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients
for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for
4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker
lengths according to the distance between the interacting proteins. The red line represents
the density of distances for all interactions. The distribution for detected interactions is
shifted to the left because proteins are closer to each other when the interactions are
detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability
of the 4xL to detect interactions further apart in space. (E) Repetition of the standard
DHFR PCA for selected results from the global PCA experiment, showing strong
reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results
from the intra-complex experiment. Examples for each category of changes are shown. Cell
growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
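A distance-weighted shortest path over surface residues, as illustrated in panels (C) and (D), can be sketched with Dijkstra's algorithm on a graph whose nodes are surface points and whose edges connect points closer than a cutoff. This is only a minimal sketch under stated assumptions: the neighbor cutoff, the coordinates and the function name are hypothetical, and the procedure actually used to build the surface graph is not described in this section.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Length of the shortest path between two points of a neighbor graph.

    points: list of (x, y, z) coordinates of surface residues.
    start, end: indices of the two C-terminal residues.
    cutoff: maximum distance for two points to be connected (hypothetical).
    Edges are weighted by Euclidean distance; returns math.inf if no path.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return d
        for v in range(len(points)):
            if v in visited:
                continue
            w = math.dist(points[u], points[v])
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf


# Toy coordinates (invented): the path must go around via point 1.
points = [(0, 0, 0), (3, 4, 0), (6, 0, 0), (10, 0, 0)]
path_length = shortest_surface_path(points, start=0, end=3, cutoff=5.5)
straight_line = math.dist(points[0], points[3])
```

In this toy case the path along the neighbor graph (14 units) is longer than the straight-line distance (10 units), which is the reason a surface path can be a better proxy than a direct distance when the protein complex lies between the two C-termini.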
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are in
close proximity in space without these necessarily being physical interactions. Such a
method would make it possible to examine and dissect the architecture of protein complexes
in greater depth. Concretely, the approach consisted in modifying the length of the linkers
of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify
whether increasing the linker length changed the interactions detected. It was also relevant
to test the application of the method to the study of protein complexes using several
combinations of linkers of different lengths. Finally, the validation of the method could be
completed by comparing the results obtained with the distances measured from the available
protein structures of the proteasome.
The results of the first validation show that by changing a single parameter, namely by
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the detected associations suggests that the specificity of
the DHFR PCA is preserved despite the change in linker length. The in-depth study of the
five protein complexes shows that this variation of the DHFR PCA detects new interactions
while preserving the specificity of the method. Indeed, among all the unique interactions
detected, more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have
already been studied. This percentage could vary depending on the number of combinations of
linkers of different lengths used. For example, this number could be reduced by using only
one combination, since some protein-protein associations were detectable only with one
specific combination of linkers. Using an elongated linker for the DHFR F[1,2] fragment
appears to be sufficient to detect the majority of the new PPIs and of those whose signal
increases. The rare cases in which the signal decreased as linker length increased are more
likely caused by steric effects than by destabilization of the proteins involved. However,
these cases can still provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of elongated linkers informs on the
organization of protein complexes, particularly when it involves the central proteins.
Finally, the detected associations closely reflect the organization of protein complexes
into subcomplexes. By comparing the distances between proteins in the proteasome structures
with the PCA results obtained, it is possible to confirm that increasing linker length does
allow the detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of additional lengths, which would provide a validation and a point of
comparison and would help detect problems that may have occurred during the construction of
the proteins. For example, it is easier to spot a problem of incorrect recombination or the
appearance of mutations. Indeed, interactions would be observed for the correctly
constructed protein, whereas the problematic one would show none. However, adding this
control certainly makes the experiments and analyses more complex. Despite this drawback,
this variation of the DHFR PCA provides access to an additional hybrid method that remains
relatively simple. It requires no special infrastructure, yet it can also be applied at
large scale using a robotic platform. Furthermore, the DHFR PCA is an in vivo method that
retains the endogenous promoter for protein expression. The fragments do not tend to
interact spontaneously with each other unless they are in very close proximity, which
reduces false positives. The DHFR PCA can be performed either on solid or in liquid medium.
It is therefore easy to study PPIs under several growth conditions or in the presence of
cellular perturbations. It can also be followed in real time, which gives access to the
study of interaction dynamics (56). These elements provide certain advantages over other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths that would provide several resolutions of the PPI network. The
first step would be to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. The second would be to
determine the optimal increment that maximizes new information, taking into account the
additional complexity introduced with each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the detected associations, which lowers the
molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as size and flexibility, should be taken into account. For small
protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient,
whereas a coarser resolution would be needed for large protein complexes.
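As a rough illustration of how such a range of linker lengths could be enumerated, the
sketch below builds linkers from repeats of the Gly-Gly-Gly-Gly-Ser motif (the 2xL, 3xL and
4xL linkers of the list of abbreviations) and computes an upper bound on their fully
extended length. The value of 3.5 Å per residue and the function names are assumptions
introduced for illustration only; they are not taken from the experiments, and a flexible
linker in vivo would typically span less than this bound.

```python
# Illustrative sketch only. The 3.5 Å-per-residue figure is an ASSUMED upper
# bound for a fully extended peptide chain, not a value measured in this work.
MOTIF = "GGGGS"              # Gly-Gly-Gly-Gly-Ser repeat unit of the linkers
ANGSTROM_PER_RESIDUE = 3.5   # assumption: fully extended chain


def linker_sequence(repeats: int) -> str:
    """Return the amino-acid sequence of a linker made of `repeats` motifs."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


def max_extended_length(repeats: int) -> float:
    """Upper bound (in Å) on the span of one fully extended linker."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for n in (2, 3, 4):  # the 2xL, 3xL and 4xL linkers
        seq = linker_sequence(n)
        print(f"{n}xL  {seq:<20}  {len(seq):>2} residues  "
              f"<= {max_extended_length(n):.1f} Å")
```

Under these assumptions, each additional repeat adds five residues, so the increment
between two consecutive linkers is constant; whether that increment is the most informative
one is precisely the open question raised above.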
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. Their
size greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are examples. They are involved, respectively, in the
degradation and the preservation of messenger RNA during cellular stress, and they are
notably linked to various diseases such as cancer and acquired immunodeficiency syndrome
(102-104). The scale of resolution afforded by lengthening the linker would give us a
general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
List of figures
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation (PCA) screen and prove to be useful to infer the super-organization of
protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
List of abbreviations
% Percent
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P-value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Litre
Log Logarithm
M Molar
Min Minute
mL Millilitre
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
μb Estimated mean
μg Microgram
μL Microlitre
μM Micromolar
2YT 2x yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of several people whom I sincerely wish to thank.
First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey,
Christian encouraged me to give the best of myself, both scientifically and as a member of
the group. He not only gave me the material means to do so, but also showed me that I had
the ability to do it. Christian is a very present supervisor who is available to his
students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and
Dr. Nicolas Bisson, for their advice and the time they devoted to this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research
professionals of the laboratory. Their great expertise and their passion for science are a
pillar of this team. Without their valuable advice, their dedication and their
availability, carrying out this project would have been particularly arduous. I also wish
to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work,
my thesis is all the better. Special thanks to Xavier for his help, his availability and
the stimulating discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate
studies require spending a great deal of time in the laboratory, which becomes like a
second home. Hence the importance of sharing laughs and building a rapport with its
members. I would like to thank them all for the chatter and the laughter at the famous
"tea breaks", the lively discussions and, of course, the support both in the laboratory and
morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help
with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her
advice on analysis. A very special thank you to Guillaume and Hélène, who were especially
good at putting a smile on my face or at supporting and advising me in difficult times.
It is also important to thank my parents, as well as all my family and my friends. My
parents have always encouraged me to fulfil myself and to love my work. They not only
provided an ideal setting for reaching my goals throughout my studies, but also offered me
their moral support and instilled in me the importance of always doing one's best. The
values they passed on to me gave me a strong sense of responsibility, honesty and
commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour
my heart out from time to time. They were a source of moral support.
Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an
incredibly generous person, generous with his time, his attention, his knowledge and his
passions. He was an invaluable support throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding soothed my fears and my
sorrows. He was also there to celebrate the successes. I have no words to describe how much
this person has given me personally, humanly and professionally. Marc has made me a better
person, and I will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that
will be submitted for publication. The article presents the adaptation of the PCA method to
detect associations between proteins that are distant in space, and its application to the
study of protein complexes. I contributed to the planning of the experiments with Christian
R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research
professionals). Several people, including myself, took part in carrying out these
experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre
K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were
performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of
the results and the writing of the article were carried out jointly by Isabelle
Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to the writing of a literature review published in
Briefings in functional genomics in March 2016 under the title Multi-scale perturbations of
protein interactomes reveal their mechanisms of regulation, robustness and insights into
genotype-phenotype maps. Several people took part in the writing: Marie Filteau
(postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral
student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and
Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental role of protein-protein interactions
Proteins, through their great diversity of roles, are regarded as the machinery of life.
Their temporary or permanent associations are at the heart of signalling and regulatory
pathways as well as of protein complexes. Proteins can interact with one another through
intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals
forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the
proper functioning of the cell, since they are involved in all cellular processes as well
as in the maintenance of cellular functions.
Interactions that form transiently are often found in signalling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination
can lead to diseases such as cancer (1). One example of a transient association is that
between the two catalytic subunits and the two regulatory subunits of protein kinase A
(PKA) (2). The activity of this enzyme is regulated by the association and dissociation of
the catalytic and regulatory subunits. In yeast and mammals, the transition from one form
to the other controls several processes, including energy metabolism, cell growth, ageing
and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to
diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between proteins,
which lead to the formation of protein complexes. Although the PPIs within a complex are
stable, the complex itself may form only in a particular context. A protein complex can be
defined as an association between two or more proteins (9). The association between these
proteins allows the emergence of additional biological activities that would be impossible
for the proteins considered individually. An example that illustrates this concept very
well is the proteasome, a protein complex involved in protein homeostasis through the
degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved
among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two
regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than
one copy (10-13). Given its importance in protein recycling, the proteasome is an
attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network has
been mapped at large scale for several organisms, notably human (17), Saccharomyces
cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several
bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the
network and do not fully take into account the cell's capacity to adapt to different
conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps
were subsequently produced that consider the dynamics of interaction networks, that is, by
perturbing cell growth conditions. Among other things, they provide information on the
adaptation, or plasticity, of an organism in the presence of a stress or a new environment.
Despite this new perspective, it remains difficult to distinguish a stable interaction from
a transient one using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary
history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear
pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion
years ago, show both similarities and differences in the structure of their nuclear pore. This
essential protein complex forms a channel in the nuclear membrane and controls the transport of
molecules between the nucleus and the cytoplasm. Obado and collaborators thus identified the
ancestral part of the nuclear pore and the part that subsequently diverged. The structural
differences explain the distinct mRNA export mechanisms of the two organisms (30). In addition,
perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that
is, the capacity of the complex to function despite the perturbation. Diss and collaborators
systematically deleted the genes coding for the proteins found in the nuclear pore and the
retromer (31). The retromer is a non-essential protein complex whose function is to recycle
membrane receptors. By analyzing the interactions present in these complexes after each
perturbation, the authors observed that the nuclear pore remained functional despite the loss of
certain proteins, whereas the retromer dissociated completely after the loss of a single protein.
They were thus able to identify the proteins essential for the assembly of these complexes and to
demonstrate the importance of paralogs for robustness (31).
In medicine, the study of PPIs has been widely used to discover new drugs (32-34). Moreover,
identifying structural differences in a protein complex between two organisms can provide
attractive targets for selectively inhibiting the complex of one organism. Very recently, a
research group developed an inhibitor that targets the proteasome of Leishmania donovani,
Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually make it possible
to treat the infections caused by these parasites (35). PPIs also help to understand the genetic
basis of diseases, as shown by Sahni and collaborators. This team examined nearly 3000 mutations
found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of the
interaction networks, whether partial or complete, was responsible for the diseases under study.
Furthermore, different mutations in the same gene cause different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to study
them. These methods are complementary, since each has advantages and limitations that restrict it
to a different subset of the interaction network (37). Nevertheless, all of them can be divided
into two main categories: methods that determine the composition of protein complexes and methods
that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second category
groups a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid
(MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category
share a very similar principle, based on the reconstitution of a functional reporter that emits a
signal when the two proteins physically interact. The second category also includes three hybrid
methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and
proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that
detect associations between proteins that are close in space without necessarily interacting
physically. These methods therefore combine the characteristics of both categories. For the
purposes of this project, they are considered part of the second category because they provide
information on the spatial relationships between proteins.
The two categories of methods are complementary: one defines the components of a protein complex,
and the other defines the relationships among those components.
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes
followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS aims to
isolate a protein complex and identify its members. Several techniques are used to purify protein
complexes, including affinity chromatography. Affinity chromatography separates a protein of
interest and its interactors from a protein extract using an epitope specific to that protein.
This epitope is recognized by an antibody bound to the purification column. Several purifications
can be carried out to reduce the non-specific interactions that cause background noise. The
isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides
and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing
the resulting profiles with those in a database identifies the proteins found in the complex
(38-40). Tandem mass spectrometry (MS/MS) is also possible. From a first MS, a peptide is selected
and fragmented, and a new spectrum is acquired from the resulting fragments. This additional
spectrum provides more information about the peptide (41, 42). Other purification techniques
exist, such as size-exclusion chromatography, in which separation is based on the size of the
protein complexes. The main benefit of this purification is that it can isolate all the protein
complexes of an organism for study (43).
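To make the mass-to-charge separation concrete, the short sketch below computes the m/z value that
would be expected for one peptide at several charge states. It assumes ionization by protonation
in positive mode, which the text does not specify; the peptide mass is a made-up value and the
function name is purely illustrative.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide that gained `charge` protons (assumed ionization mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Hypothetical peptide of 1000 Da: the same peptide appears at a different
# m/z position for each charge state.
peptide_mass = 1000.0
for z in (1, 2, 3):
    print(f"z={z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same neutral mass therefore yields several peaks, which is one reason spectra are compared
against database profiles rather than read off directly.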
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and the protein-fragment complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest through a linker. When the two proteins of interest
physically interact, the two reporter fragments assemble, reconstituting a functional reporter
that produces a detectable signal. In Y2H, the reporter is a transcription factor which, once
reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially,
the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a
pioneering method that enabled the development of several others. However, the technique has some
limitations. First, in classical Y2H the proteins studied must be soluble. Variations of the
method have nevertheless been developed to allow the study of membrane proteins (45-47); this
approach is the subject of the next paragraph. Second, because the reporter is a transcription
factor, the interactions tested must be located in the nucleus, which may alter the endogenous
localization of the proteins. The technique also has low sensitivity, produces background noise
and is not quantitative. It often requires protein overexpression, which can generate false
positives. It is therefore impossible to relate the abundance of a protein to the strength or
abundance of an interaction between proteins (48-50). Despite these constraints, it is still
widely used because it allows the PPIs of another species, such as humans, to be studied in a
simpler model (51).
In MYTH, the two reporter fragments are a mutated ubiquitin to which a transcription factor is
attached. When the proteins of interest physically interact, the transcription factor attached to
the reconstituted ubiquitin is released, activating transcription of a reporter gene.
Split-ubiquitin methods have enabled major advances in the study of membrane proteins, insoluble
proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H,
such as substantial background noise and the impossibility of quantifying the results (47-50, 52,
53).
PCA is similar to the two methods described above, but instead of using a transcription factor as
a reporter, it uses a protein that has been split into two fragments. The choice of the reporter
and of the cleavage site were decisive elements in the design of the method. Moreover, because the
reporter fragments come from a single protein rather than from two subunits of the same protein,
they do not tend to interact spontaneously unless they are very close to each other, which reduces
background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme
dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This
enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain
DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the
number of cells that managed to grow on the selection medium. The technique has the advantage of
being quantitative while preserving the native promoter of the proteins studied (48, 55, 56).
Furthermore, PCA results suggest that the cellular localization of proteins is preserved: there is
a gene ontology enrichment for several known proteins sharing the same cellular localization (55).
However, a change in localization cannot be ruled out, because the reporter fragments are added at
the C-terminus, which could interfere with the localization signal sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter fragments, which
can affect the folding, cellular function or abundance of the protein. Adding a linker often
reduces these risks by moving the reporter fragment away from the protein to which it is attached,
which reduces interference between the two. It may be necessary to optimize the linker's
composition or length. There are three categories of linkers: flexible linkers, rigid linkers and
in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the
protein of interest and the reporter fragment is desirable. Rigid linkers provide better
separation between the protein of interest and the reporter fragment and ensure that the functions
of each element are maintained. They are most useful when a flexible linker is insufficient to
separate the two elements properly or interferes with the protein's activity. In vivo cleavable
linkers release the reporter fragment under certain conditions. They are particularly useful for
allowing each element to carry out its own biological activity. It is therefore essential to
choose the linker and its parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS, and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two
fluorescent proteins are fused to the two proteins whose proximity is being tested. Excitation of
the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two
proteins are close to each other. The interaction is detected by microscopy or cytometry through
the emission of the acceptor fluorescent protein. The method is particularly useful for following
an interaction over time. However, substantial background noise and the partial overlap of the
fluorescence of the two proteins can hinder interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS techniques,
except that before purification the proteins are attached to one another by covalent bonds. These
bonds withstand enzymatic digestion and thus provide structural information on how the proteins
associate within the complex. However, cross-linking complicates data analysis and can lead to a
mistaken picture of the complex's architecture. The method is difficult to apply to the global
study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the
protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions as
well as interactions between neighbouring proteins. However, the biotin ligase is larger than
green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This
large size can interfere with the activity of the protein of interest or with the formation of
interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global view
of the PPI network. They report on the proximity of proteins, giving access to a scale of
molecular resolution that is otherwise difficult to reach. In addition to being complex, the
existing techniques require specialized infrastructure (equipment and databases) and are difficult
to apply at large scale. Developing simpler, higher-throughput hybrid methods would help define
the architecture of protein complexes and their subcomplexes at low molecular resolution. Such
methods would complement the two categories of methods. These new hybrid methods would compensate
for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic
resonance, which determine the precise structure of proteins or protein complexes. Indeed, these
high-resolution methods are difficult to apply to many protein complexes and require an approach
tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein
interactions
Because of its relative simplicity and the linker that connects the reporter fragments to the
proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a
short, soluble and flexible peptide segment composed of two repeats of the following motif: four
glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter
fragments in the cellular environment. Indeed, glycine and serine are two small amino acids,
neither of which carries a charged side chain: glycine is nonpolar and serine is polar. The linker
connects the reporter fragment to the C-terminus of the proteins under study.
Linker length also constrains the ability to detect an interaction, as observed by the research
team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other
protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected
when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance
corresponds to the length of the two linkers placed end to end. In addition, an earlier study had
shown that increasing the linker length made it possible to determine the conformation of a
dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain
new structural information.
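As a rough illustration of how the reachable distance might grow with linker length, the sketch
below scales the 82 Å figure reported for two 10-residue linkers (two GGGGS repeats each) to longer
linkers. The linear scaling with residue count is an assumption made here for illustration, not a
result from the text, and the function names are hypothetical.

```python
MOTIF = "GGGGS"                   # linker motif given in the text
REFERENCE_REPEATS = 2             # repeats per linker in the standard construct
REFERENCE_SPAN_ANGSTROM = 82.0    # reported span of two standard linkers end to end


def linker_sequence(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_span_angstrom(repeats_a: int, repeats_b: int) -> float:
    """Estimated end-to-end span of two linkers (ASSUMES linear scaling with length)."""
    reference_residues = 2 * len(linker_sequence(REFERENCE_REPEATS))
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * REFERENCE_SPAN_ANGSTROM / reference_residues


for repeats in (2, 3, 4):
    print(f"{repeats}x linkers on both proteins: ~{max_span_angstrom(repeats, repeats):.0f} Å")
```

Real linkers are flexible and rarely fully extended, so these numbers are best read as upper
bounds under the stated assumption.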
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length in DHFR PCA would allow the detection
of interactions increasingly distant in space, thereby modulating the scale of molecular
resolution. This adaptation would yield a new hybrid method that could help define protein-protein
associations between protein complexes and subcomplexes. The first objective was to assess the
general impact of different linker lengths on the ability to detect protein-protein associations.
To this end, the associations of 15 proteins found in seven protein complexes were tested against
the proteins found in these complexes and their known interactors. The second objective was to
assess the impact of increasing linker length on the understanding of the architecture of protein
complexes and their subcomplexes. Five protein complexes differing in size and flexibility were
studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex.
The study was carried out with different combinations of linker lengths. The final objective was
to determine whether increasing linker length allowed the detection of associations between
proteins that are farther apart in space. To do so, distances between the proteins contained in
the proteasome structures were calculated and compared with the experimental results.
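The distance comparison described in the last objective can be sketched as follows. This is a
minimal illustration, assuming that each protein is represented by the 3D coordinates (in Å) of
its C-terminus and that a pair counts as reachable when that distance does not exceed a chosen
linker span. The protein names, coordinates and thresholds are invented for the example and do
not come from any structure.

```python
import math
from itertools import combinations

# Hypothetical C-terminus coordinates in Å (not taken from a real structure).
c_termini = {
    "ProtA": (0.0, 0.0, 0.0),
    "ProtB": (30.0, 40.0, 0.0),
    "ProtC": (0.0, 0.0, 120.0),
}


def pairwise_distances(coords):
    """Euclidean distance between the C-termini of every pair of proteins."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def reachable_pairs(distances, max_span):
    """Pairs whose C-termini lie within `max_span` Å of each other."""
    return sorted(pair for pair, d in distances.items() if d <= max_span)


distances = pairwise_distances(c_termini)
for span in (82.0, 123.0):  # illustrative thresholds for shorter and longer linkers
    print(span, reachable_pairs(distances, span))
```

Pairs that become reachable only at the larger threshold are the ones a longer linker would be
expected to reveal, which is what the comparison with experimental results is meant to check.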
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly
attractive for several reasons, notably the availability of numerous powerful genetic tools, its
rapid cell division and the abundance of data on the structure of protein complexes and on PPIs.
Moreover, this organism has played a key role in advancing knowledge in various fields, such as
the determination of protein function, regulatory networks, gene expression, protein interaction
networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with one
another into complexes and determining their spatial arrangements. We examined the potential of
the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to
obtain low-resolution structural constraints on protein complexes. We showed that using extended
peptide linkers between the fusion proteins and the DHFR fragments improves the detection of
protein-protein interactions and reveals interactions that are more distant in space. Extended
linkers thus provide an improved tool for detecting and measuring protein-protein interactions and
protein proximity in vivo. We used this tool to further investigate the architecture of the RNA
polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results
open new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble with
each other into complexes and determining their spatial relationships. Here we examine the
potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase
(DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the
use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly
improves the detection of protein-protein interactions and reveals interactions between proteins
that are further apart in space. Longer linkers thus provide an enhanced tool for the detection
and measurement of protein-protein interactions and protein proximity in living cells. We use this
tool to further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization of
PPI networks have revealed important insights into the evolution of cellular functions (30, 31,
55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown
how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories based
on whether they infer co-complex memberships or detect physical association (81). The first
category includes methods based on protein purification followed by mass spectrometry. In this
case, protein assignment to a specific complex depends on stable association among proteins that
survives cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that
populate interactome databases derive from such methods because a single purification leads to the
inference of many interactions among the co-purified proteins. Unfortunately, very little is known
about the structural and context dependencies of PPIs inferred from co-complex membership because
detecting an association does not provide information on the spatial organization of the complex
(84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based
on similar principles (52). The two categories are potentially complementary because, on the one
hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how
proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our ability to
infer the relationships among proteins within and between complexes or subcomplexes. Being able to
infer such relationships at different levels of resolution in living cells is key to future
development in cell and systems biology because high-resolution methods such as NMR or X-ray
crystallography are not yet amenable to high-throughput analysis and cannot be applied to all
protein types. PCA (87, 89) may provide the technological advantages required for such an approach
by complementing methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually
at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that
acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually
connected to the reporter fragments with a linker of ten amino acids. In principle, the length of
the linker limits the maximum distance between the proteins for an interaction to be detectable.
In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance
constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA
polymerase (RNApol) II complex and several other protein complexes for which the distance between
C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be
detected if the C-termini were within 82 Å of each other. In addition, an earlier study in
mammalian cells showed that increasing the linker length of the PCA reporter allows the detection
of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that
linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to
infer, albeit roughly, distances between proteins in living cells. Here we test the effect of
linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[12]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the linker present and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[12]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[12]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[12]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[12] and 4xL-DHFR F[12] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[12] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[12]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[12] region was extracted
on gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[12]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[12]-ADHterm and pAG25-4xL-DHFR
F[12]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted on gel. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[12]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[12] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes.
2x/3x/4xL-DHFR F[12]/F[3] fragments, along with the NAT (for DHFR F[12]) or HPH (for
DHFR F[3]) resistance modules (respectively for resistance to clonNAT and HygB), were
amplified by PCR from their respective plasmid with oligonucleotides specific to the gene to
fuse with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[12]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing for all
strains confirmed proper DHFR fragment fusions.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies directed against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[12] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or because
they interact with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44
tested) and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[12] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least
one mating type) of the proteins. MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[12]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files, and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area inferior to 20 pixels and
circularity inferior to 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
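The transformation and the size filter described above can be sketched as follows. This is a minimal illustration only, not the script used in this study: the function name, argument names and example values are hypothetical, and we assume that the filter simply drops colonies whose diploid-plate size is below 16.

```python
import math

def log2_intensities(intensities, diploid_sizes, min_diploid_size=16):
    """Return log2(intensity + 1) for colonies that pass the diploid size filter.

    intensities: background-subtracted colony intensities on the MTX plate.
    diploid_sizes: size of the same colonies on the diploid selection plate.
    Both names and the pairing by position are assumptions for illustration.
    """
    kept = []
    for intensity, size in zip(intensities, diploid_sizes):
        if size < min_diploid_size:
            continue  # colony too small on the diploid plate: eliminated
        kept.append(math.log2(intensity + 1))  # +1 avoids log2(0)
    return kept

# Hypothetical values: the last colony fails the diploid size filter.
values = log2_intensities([0, 3, 1023, 500], [40, 40, 40, 10])
print(values)
```

Adding 1 before the logarithm keeps colonies with a null intensity in the dataset with a transformed value of 0.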
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were conserved and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
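The scoring steps above can be illustrated with a short sketch. It assumes that the background mean (b) and standard deviation (sdb) have already been estimated by the mixture model; the mixture fit itself is not reproduced here. All function names and numerical values are hypothetical, and "differed by more than 2" is read as an absolute difference between z-scores.

```python
from statistics import median

def interaction_score(replicate_sizes, min_replicates=2):
    """Median colony size (Is), or None if too few replicates remain."""
    if len(replicate_sizes) < min_replicates:
        return None
    return median(replicate_sizes)

def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb from the background distribution."""
    return (score - background_mean) / background_sd

def is_detected(zs, threshold=2.5):
    """An interaction is called detected when Zs exceeds the threshold."""
    return zs > threshold

def is_changed(zs_reference, zs_longer, min_delta=2.0):
    """Assumed reading: a change is significant when |delta Zs| > 2."""
    return abs(zs_longer - zs_reference) > min_delta

# Hypothetical background parameters and replicate measurements.
b, sdb = 8.0, 0.5
zs_2xl = z_score(interaction_score([9.0, 9.0, 8.5, 9.5]), b, sdb)
zs_4xl = z_score(interaction_score([10.4, 10.6, 10.5, 10.5]), b, sdb)
print(zs_2xl, zs_4xl, is_detected(zs_2xl), is_detected(zs_4xl),
      is_changed(zs_2xl, zs_4xl))
```

In this made-up example the interaction falls below the detection threshold with the reference linker and above it with the longer linker.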
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
conserved and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
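The extreme-outlier rule used in this filtering can be sketched as below. This is an illustration under assumptions: the quartile estimator is not specified in the text, so the "inclusive" method of Python's statistics.quantiles is used here as a stand-in, and the function name and example data are hypothetical.

```python
from statistics import quantiles

def drop_extreme_outliers(sizes, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    The quartile method ("inclusive") is an assumption; other estimators
    give slightly different fences on small samples.
    """
    q1, _, q3 = quantiles(sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in sizes if low <= s <= high]

# Hypothetical replicate colony sizes with one aberrant value.
filtered = drop_extreme_outliers([10, 11, 12, 13, 14, 100])
print(filtered)
```

With k = 3 the fences are wider than the usual 1.5 x IQR rule, so only values far from the bulk of the replicates are discarded.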
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to discriminate significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
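The classification rules above can be summarized in a short sketch. Several points are assumptions made for illustration: the FDR correction is implemented as the Benjamini-Hochberg procedure (the text only states that p-values were FDR corrected), the paired t-test itself is not reproduced and raw p-values are taken as inputs, and a pair with a significant p-value but an absolute difference of 1 or less is treated as unchanged. Function names and values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def classify(detected_ref, detected_longer, difference, fdr_p):
    """Assign a pair to one of the categories described in the text."""
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if abs(difference) > 1 and fdr_p < 0.05:
        return "quantitative change"
    return "unchanged"

# Hypothetical raw p-values for four linker-combination comparisons.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.20])
print([round(p, 4) for p in adjusted])
print(classify(True, True, 1.5, adjusted[0]))
```

The adjusted p-value, not the raw one, is compared with the 0.05 cutoff when assigning a category.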
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched through the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL software. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). Cα atoms of surface residues were used as
nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of the edges was equal to the distance between node pairs. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered as surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
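The graph-based distance described above can be sketched as follows. This is not the code used in this study: the analysis relied on NetworkX, whereas the sketch uses a standard-library implementation of Dijkstra's algorithm (heapq) as a stand-in so that it runs without dependencies. The coordinates are made up, the function names are hypothetical, and the identification of surface residues is assumed to have been done beforehand.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than the cutoff.

    coords: list of (x, y, z) positions of surface C-alpha atoms, in angstroms.
    The edge weight is the Euclidean distance between the two nodes.
    """
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph

def shortest_path_length(graph, source, target):
    """Weighted shortest path length (Dijkstra); math.inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf

# Hypothetical surface nodes; node 0 and node 3 stand for two C-termini.
coords = [(0, 0, 0), (12, 0, 0), (24, 0, 0), (24, 8, 0)]
graph = build_graph(coords, cutoff=15)
path_length = shortest_path_length(graph, 0, 3)
print(round(path_length, 2))
```

Because edges only join nearby surface nodes, the path follows the surface of the complex instead of cutting through it, so the resulting distance is at least as long as the straight-line distance between the two C-termini.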
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[12] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[12] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases) as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as
X-DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] account for only one
PPI) after filtration.
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[12]-Y-DHFR F[3] versus Y-DHFR F[12]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[12]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had an opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D).
Finally, we find that distance among proteins is significantly longer for cases where longer
linker size increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation (PCA) screen and prove to be useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[12] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering every pair of reciprocal interactions such as X-DHFR
F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly
detected. p-values of Wilcoxon tests are shown.
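As an illustration of the statistic reported in panel (B), the sketch below computes a Spearman rank correlation between C-terminal distances and z-scores, and assigns distances to ten classes. It is a from-scratch version written for clarity, not the code used in the study. The equal-width binning and all numerical values are assumptions.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def distance_class(distance, d_min, d_max, n_classes=10):
    """Index of the equal-width class (an assumed binning scheme)."""
    if distance >= d_max:
        return n_classes - 1
    return int((distance - d_min) / (d_max - d_min) * n_classes)

# Made-up distances (in angstroms) and PPI z-scores.
distances = [10, 25, 40, 80, 120, 150]
z_scores = [9.0, 7.5, 8.0, 3.0, 2.0, 2.5]
rho = spearman(distances, z_scores)
```

A negative coefficient, as in the caption, means that the PCA signal tends to decrease as the distance between C-termini increases.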
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Columns: reference structure and chain, yeast protein, complex, identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
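For readers unfamiliar with the identity values listed above, the sketch below shows one common way to compute percent identity from two pre-aligned sequences. How identity was actually computed for this table is not stated here. The denominator (positions where the structure sequence has a residue), the function name and the toy sequences are all assumptions.

```python
def percent_identity(structure_aln, experimental_aln, gap="-"):
    """Percent of structure residues identical to the aligned experimental residue."""
    if len(structure_aln) != len(experimental_aln):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for s, e in zip(structure_aln, experimental_aln):
        if s == gap:
            continue  # position absent from the structure: not counted
        compared += 1
        if s == e:
            matches += 1
    if compared == 0:
        raise ValueError("no residues to compare")
    return 100.0 * matches / compared

# Toy alignment: one mismatch (T/S) and one residue missing from the structure.
identity = percent_identity("MKT-LV", "MKSALV")
```

Other conventions, such as dividing by the full alignment length, would give slightly different values for the same alignment.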
Table S2C. Identity between proteasome structure and the experimental sequence
Columns: reference structure and chain, yeast protein, complex, identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Columns: yeast protein, complex, reference structure, number of missing residues at the C-terminus
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to
detect interactions between proteins that are farther apart in space. (E) Repetition of the
standard DHFR PCA for selected results of the global PCA experiment, showing strong
reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results
from the intra-complex experiment. Examples for each category of changes are shown. Cell
growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
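The thresholding described in panel (B) can be illustrated with a minimal sketch. The cutoff of 2.5 comes from the caption. Standardizing against a separate background sample of colony sizes, and all names and numbers below, are assumptions made for illustration and may differ from the analysis actually performed.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff quoted in the caption

def z_score(colony_size, background):
    """Standardize one colony size against a background sample (assumed reference)."""
    return (colony_size - mean(background)) / stdev(background)

def call_positive(colony_sizes, background, threshold=Z_THRESHOLD):
    """Map each PPI label to True when its z-score exceeds the threshold."""
    return {
        ppi: z_score(size, background) > threshold
        for ppi, size in colony_sizes.items()
    }

# Hypothetical colony sizes, for illustration only.
background = [10, 12, 11, 9, 13, 10, 11, 12]
colony_sizes = {"BaitA-PreyB": 20, "BaitC-PreyD": 13}
calls = call_positive(colony_sizes, background)

# Equivalent colony-size threshold, as drawn by the dashed or gray lines.
size_threshold = mean(background) + Z_THRESHOLD * stdev(background)
```

Expressing the cutoff as a colony size, as in the last line, is what allows it to be drawn directly on the distributions of panel (B).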
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
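The idea of a distance-weighted shortest path between two C-termini, as in panel (C), can be sketched with Dijkstra's algorithm over a set of surface points. This is a generic illustration, not the procedure used to produce Table S2A. The neighbor cutoff, the function name and the coordinates are assumptions.

```python
import heapq
import math

def shortest_surface_path(points, start, goal, cutoff):
    """Length of the shortest path from start to goal (indices into points).

    Two points are connected when they are at most `cutoff` apart, and each
    edge is weighted by the Euclidean distance between its end points.
    Returns math.inf when the goal cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other in range(len(points)):
            if other == node:
                continue
            step = math.dist(points[node], points[other])
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf

# Toy coordinates: the first and last points stand for two C-terminal residues.
points = [(0, 0, 0), (4, 0, 0), (4, 3, 0), (8, 3, 0)]
path_length = shortest_surface_path(points, start=0, goal=3, cutoff=4.5)
straight_line = math.dist(points[0], points[3])
```

In this toy case the path that follows the intermediate points is longer than the straight-line distance, which is the motivation for measuring distances along the surface rather than through the complex.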
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in
space without necessarily being in physical contact. Such a method would make it possible
to dissect the architecture of protein complexes in greater depth. In practice, this meant
modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method,
we first had to verify whether increasing the linker length changed the set of detected
interactions. It was also relevant to test the method on protein complexes using several
combinations of linkers of different lengths. Finally, the validity of the method could be
further confirmed by comparing the results with distances measured from the available
protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all unique interactions detected, more than 30%
were new. One could therefore expect to obtain nearly as many new interactions if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of linker-length combinations used. For example, it
could be lower if only one combination were used, since some protein-protein associations
were detectable only with a specific combination of linkers. Using an extended linker on the
DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of
those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers is informative
about the organization of protein complexes, particularly when central proteins are involved.
Finally, the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are farther apart in space.
The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside the longer linkers,
which would provide a validation and a point of comparison and would help detect problems
that may have arisen during construction of the fusion proteins. For example, faulty
recombination or the appearance of mutations becomes easier to spot: interactions would be
observed for the correctly constructed protein, whereas the problematic one would show none.
Adding this control admittedly makes the experiments and analyses more complex. Despite this
drawback, this variation of DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no special infrastructure, yet it can also be applied at large
scale with a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the
endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close together, which reduces false positives. DHFR PCA
can be performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the dynamics of interactions (56). These
features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths giving access to several resolutions of the PPI network. One would
first need to determine the maximal length that still detects plausible protein-protein
associations while limiting false positives. One would also need to determine the optimal
increment that maximizes new information while taking into account the additional complexity
introduced by each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2]-tagged proteins with different linker lengths.
Such additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the detected associations, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be
needed for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are examples. They are involved, respectively, in the degradation
and the storage of messenger RNA during cellular stress, and they are linked to various
diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of
resolution afforded by linker extension would give us a general picture of their
architecture. At the scale of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
List of abbreviations
% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Litre
Log Logarithm
M Molar
Min Minute
mL Millilitre
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z score
μb Estimated mean
μg Microgram
μL Microlitre
μM Micromolar
2YT 2× yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgements
Completing this project required the help of several people whom I sincerely wish to thank.
First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey,
Christian encouraged me to give the best of myself, both scientifically and as a member of the
group. He not only gave me the material means to do so, but also showed me that I had the
abilities to do it. Christian is a very present supervisor who is available to his students. He
offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr.
Nicolas Bisson, for their advice and the time they devoted to me in this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research
professionals of the laboratory. Their great expertise and their passion for science are a
pillar of this team. Without their valuable advice, their dedication and their availability,
carrying out this project would have been particularly arduous. I also wish to thank my
collaborators Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is
all the better. Special thanks to Xavier for his helpfulness, his availability and the
stimulating discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies
require spending a great deal of time in the laboratory, which becomes like a second home.
Hence the importance of sharing laughs and building a rapport with its members. I would like
to thank them all for the chats and the laughter at the famous "tea breaks", the lively
discussions and, of course, the support both in the laboratory and morally. Thank you to
Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to
Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very
special thank you to Guillaume and Hélène, who were especially good at putting a smile on my
face or supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends. My parents
have always encouraged me to fulfil myself and to love my work. They not only provided an ideal
environment for reaching my goals throughout my studies, but also offered me their moral
support and instilled in me the importance of always doing one's best. The values they passed
on to me have given me a strong sense of responsibility, honesty and involvement. Thanks to my
family and friends, I was able to unwind, simply have fun and pour my heart out from time to
time. They have been a source of moral support.
Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an
incredibly generous person: generous with his time, his attentiveness, his knowledge and his
passions. He was an invaluable support throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows.
He was also there to celebrate the successes. I have no words to describe how much this person
has given me personally, humanly and professionally. Marc has made me a better person, and I
will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will be
submitted for publication. The article presents the adaptation of the PCA method to detect
associations between proteins that are distant in space, and its application to the study of
protein complexes. I contributed to the planning of the experiments with Christian R. Landry
(project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals).
Several people, myself included, took part in carrying out these experiments, namely Isabelle
Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie
Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau
(collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of
the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to the writing of a literature review published in
Briefings in functional genomics in March 2016 under the title Multi-scale perturbations of
protein interactomes reveal their mechanisms of regulation, robustness and insights into
genotype-phenotype maps. Several people took part in the writing: Marie Filteau (postdoctoral
fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume
Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. This
article is not presented in this thesis.
General introduction
1.1 The fundamental importance of protein-protein interactions
Proteins, through their great diversity of roles, are considered the machinery of life. Their
temporary or permanent associations are at the heart of signalling and regulatory pathways as
well as protein complexes. Proteins can interact with one another through intermolecular forces
such as hydrogen bonds, hydrophobic interactions, Van der Waals forces and ionic interactions.
Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since
they are involved in all cellular processes as well as in the maintenance of cellular
functions.
Interactions that form transiently are often found in signalling and regulatory processes. They
require excellent spatiotemporal coordination, which explains why poor coordination can lead to
diseases such as cancer (1). One example of a transient association is that of the two
catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity
of this enzyme is regulated by the association and dissociation of the catalytic and regulatory
subunits. In yeast and mammals, the transition from one form to the other controls several
processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7).
In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell is home to stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs of a complex are
stable, the complex may form only in a particular context. A protein complex can be defined as
an association between two or more proteins (9). The association between these proteins allows
the emergence of additional biological activities that would be impossible for the proteins
considered individually. An example that illustrates this concept very well is the proteasome,
a protein complex involved in protein homeostasis through the degradation of obsolete proteins
tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, is composed of a
barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises
33 proteins, some of which are present in more than one copy (10-13). Given its importance in
protein recycling, the proteasome is an attractive target for fighting cancer and
neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate
interaction network. Mapping PPI networks is essential to understand the organization,
functioning and cellular viability of a given organism. The PPI network has been mapped on a
large scale for several organisms, notably human (17), Saccharomyces cerevisiae (18-20),
Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several
viruses (27-29). These maps represent a static picture of the network that does not fully take
into account the cell's capacity to adapt to different conditions (e.g. environment, cell
cycle). To overcome this limitation, additional maps were subsequently produced that consider
the dynamics of interaction networks, that is, by perturbing cell growth conditions. Among
other things, they provide information on the adaptation or plasticity of an organism in the
presence of a stress or a new environment. Despite this new perspective, it remains difficult
to distinguish a stable interaction from a transient one using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs offers a new perspective on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the
study of the nuclear pore of yeast and trypanosomes (30). These two organisms, which diverged
more than 1.5 billion years ago, show similarities and differences in the structure of their
nuclear pore. This essential protein complex forms a channel in the membrane of the cell
nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado
and colleagues thus identified the ancestral part of the nuclear pore and the part that
subsequently diverged. The differences in structure explain the distinct mechanisms of mRNA
export in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate
the robustness of a protein complex to mutations, that is, the ability of the complex to
function despite the perturbation. Diss and colleagues systematically deleted the genes
encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a
non-essential protein complex whose function is to recycle membrane receptors. By analysing the
interactions present in these complexes after each perturbation, the authors observed that the
nuclear pore remained functional despite the loss of certain proteins, whereas the retromer
dissociated completely after the loss of a single protein. They were thus able to identify the
proteins essential for the assembly of these complexes and to demonstrate the importance of
paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34).
Moreover, identifying structural differences in a protein complex between two organisms can
provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should eventually
make it possible to treat the infections caused by these parasites (35). PPIs also help to
understand the genetic basis of diseases, as shown by Sahni and colleagues. This team examined
nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases,
perturbation of interaction networks, whether partial or complete, was responsible for the
diseases under study. Furthermore, different mutations in the same gene lead to different
perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary, since each has advantages and limitations that
allow it to target only a different subset of the interaction network (37). Nonetheless, all
the methods can be divided into two main categories: methods that determine the composition of
protein complexes and methods that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyse it by mass spectrometry (MS). The second category
brings together a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast
two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the
second category share a very similar principle, based on the reconstitution of a functional
reporter that emits a signal when the two proteins physically interact. The second category
also includes three hybrid methods: energy transfer between fluorescent molecules (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context,
the term "hybrid method" refers to methods that detect associations between proteins that are
close together in space without these associations necessarily being physical interactions.
These methods therefore have characteristics of both categories. For the purposes of this
project, they are considered part of the second category because they provide information on
the spatial relationships between proteins.
The two categories of methods are complementary, because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships these components
maintain with one another.
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes
followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS is a
method that aims to isolate a protein complex and identify its members. Several techniques are
used to purify protein complexes, including affinity chromatography. Affinity chromatography
separates a protein of interest and its interactors from a protein extract using an epitope
specific to that protein. This epitope is recognized by an antibody bound to the purification
column. Several purifications can be performed to reduce the non-specific interactions that
cause background noise. The isolated proteins are then digested into peptides. The mass
spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio,
yielding a mass spectrum. Comparing the resulting profiles with those in a database makes it
possible to identify the proteins found in the complex (38-40). Tandem mass spectrometry
(MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new
spectrum is acquired from the resulting fragments. This additional spectrum provides more
information about the peptide (41, 42). Other purification techniques exist, such as
size-exclusion chromatography, in which separation is based on the size of the protein
complexes. The main advantage of this purification is that it allows all the protein complexes
of an organism to be isolated for study (43).
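The separation of peptides by mass-to-charge ratio described above can be illustrated with a
minimal sketch. It assumes unmodified peptides, positive-mode ionization by protonation and
standard monoisotopic residue masses. The function names (peptide_mass, mass_to_charge,
match_peptides) and the matching tolerance are hypothetical and are not taken from the cited
studies. Real database searches compare full fragment spectra rather than a single m/z value.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of each charge-carrying proton


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence: str, charge: int) -> float:
    """m/z of the peptide carrying `charge` protons (positive mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


def match_peptides(observed_mz, candidates, charge=2, tol=0.01):
    """Hypothetical helper: candidates whose theoretical m/z lies within `tol` of the observation."""
    return [p for p in candidates
            if abs(mass_to_charge(p, charge) - observed_mz) <= tol]
```

For example, mass_to_charge("GGGGS", 2) gives the m/z at which a doubly protonated
Gly-Gly-Gly-Gly-Ser peptide would be expected under these assumptions.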
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and the protein-fragment complementation
assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
linked to the two proteins of interest via a linker. When the two proteins of interest
physically interact, the two reporter fragments assemble, reconstituting a functional reporter
that produces a detectable signal. In Y2H, the reporter is a transcription factor which, once
reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium.
Initially, the transcription factor was Gal4p and the selection medium contained galactose
(44). Y2H was a pioneering method that enabled the development of several other methods.
However, this technique has some limitations. First, in classical Y2H, the proteins studied
must be soluble. Variations of the method have nevertheless been developed to allow the study
of membrane proteins (45-47); one such variation is the subject of the next paragraph. Second,
since the reporter is a transcription factor, the interactions tested must be localized in the
nucleus, which may alter the endogenous localization of the proteins. This technique also has
low sensitivity, shows background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is therefore impossible
to establish links between the abundance of a protein and the strength or abundance of an
interaction between proteins (48-50). Despite these constraints, it is still widely used
because it makes it possible to study the PPIs of another species, such as human, in a simpler
model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. When the proteins of interest physically interact, the transcription factor
attached to the reconstituted ubiquitin is released, thereby activating the transcription of a
reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of
membrane proteins that are insoluble and located outside the nucleus. However, MYTH shares some
drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying
the results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription factor
as the reporter, it uses a protein that has been split into two fragments. The choice of
reporter and of the cleavage site were decisive elements in the design of the method. Moreover,
since the reporter fragments come from a single protein rather than from two subunits of the
same protein, they do not tend to interact spontaneously unless they are very close to each
other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated
version of the enzyme dihydrofolate reductase (DHFR) that confers on the cell resistance to
methotrexate (MTX). This enzyme is essential for cell growth and is involved in particular in
the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is
cell density, that is, the number of cells that managed to grow on the selection medium. This
technique has the advantage of being quantitative while preserving the native promoter of the
proteins studied (48, 55, 56). Furthermore, the results obtained by PCA suggest that the
cellular localization of the proteins is preserved. Indeed, there is a gene ontology enrichment
for several known proteins sharing the same cellular localization (55). However, a change in
localization cannot be ruled out, given that the reporter fragments are added at the
C-terminus, which could interfere with the localization signal sequence of the proteins (57).
One of the major drawbacks of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
However, adding a linker often reduces these risks by moving the reporter fragment away from
the protein to which it is attached, which reduces interference between the two proteins. Its
composition or length may need to be optimized. There are three categories of linkers:
flexible linkers, rigid linkers and linkers that are cleavable in vivo. Flexible linkers are
generally used when some mobility between the protein of interest and the reporter fragment
is desirable. Rigid linkers provide better separation between the protein of interest and the
reporter fragment and ensure that the functions of each element are maintained. They are
mostly useful when a flexible linker is insufficient to separate the two elements properly or
when it interferes with the activity of the protein. Linkers that are cleavable in vivo allow the
reporter fragment to be released under certain conditions. They are particularly useful for
allowing each element to carry out its own biological activity. It is therefore essential to
choose the linker and its parameters carefully in order to obtain the expected results
(58, 59).
1.3.2.2 Hybrid methods
Although they are classified in the second category of methods, FRET, cross-linking followed
by MS and BioID are hybrid methods that measure protein-protein associations at lower
resolution.
FRET relies on energy transfer between two fluorescent proteins that are close to each other.
The two fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent
protein when the two proteins are close to each other. The interaction is detected by
microscopy or cytometry through the emission of the acceptor fluorescent protein. This method
is particularly useful for following an interaction over time. However, substantial
background noise and the partial overlap of the fluorescence of the two proteins can hamper
the interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to purification and MS techniques,
except that the proteins are attached to one another by covalent bonds before purification.
These bonds withstand enzymatic digestion and thus provide structural information on how the
proteins associate within the protein complex. Nevertheless, cross-linking complicates data
analysis and can also lead to a mistaken picture of the architecture of the protein complex.
This method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is carried out by a mutant biotin ligase devoid of specificity that is fused to
the protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions
as well as interactions between neighbouring proteins. However, the biotin ligase is larger
than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular
biology. This large size can impair the activity of the protein of interest or the formation
of interactions. Moreover, this method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They report on the proximity of proteins and give access to a
molecular scale of resolution that is otherwise difficult to reach. In addition to being
complex, existing techniques require specific infrastructure (equipment and databases) and
are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods
would make it possible to better define the architecture of protein complexes and of their
subcomplexes at low molecular resolution. Such methods would complement the two categories of
methods. These new hybrid methods would compensate for the shortcomings of high-resolution
methods such as crystallography or nuclear magnetic resonance, which determine the precise
structure of proteins or protein complexes. Indeed, high-resolution methods are difficult to
apply to many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and because of the linker that joins the reporter
fragments to the proteins of interest, PCA is a method of choice for developing a hybrid
method. The linker is a short, soluble and flexible peptide segment made of two repeats of the
following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good
association of the reporter fragments in the cellular environment. Indeed, glycine and serine
are two small amino acids, one nonpolar and the other polar, respectively. The linker joins
the reporter fragment to the C-terminus of the proteins under study.
The length of the linker also constrains the ability to detect an interaction, as was notably
observed by the research team that developed large-scale PCA (55). While studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noted that an
interaction was 35 times more likely to be detected when the C-termini of the proteins of
interest were less than 82 Å apart (55). This distance corresponds to the length of the two
linkers placed end to end. In addition, an earlier study had shown that increasing the length
of the linker made it possible to determine the conformation of a dimeric receptor (69). It is
therefore possible to detect new interactions and, in doing so, to obtain new structural
information.
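As a rough worked example of this distance constraint, the 82 Å figure can be spread over the
20 residues of two 2xL linkers placed end to end and then scaled to longer linkers. The sketch
below assumes that reach grows linearly with the number of residues, which is an idealization
(flexible linkers rarely stay fully extended); the function and constant names are
illustrative only.

```python
# Rough reach of a bait/prey linker pair, scaled from the 82 Å quoted for two 2xL linkers.
# Assumption: reach grows linearly with residue count (fully extended linkers).

RESIDUES_PER_REPEAT = 5           # one GGGGS motif
REFERENCE_SPAN_ANGSTROM = 82.0    # two 2xL linkers end to end (value from the text)
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT  # 20 residues in total

ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def max_span(bait_repeats: int, prey_repeats: int) -> float:
    """Approximate maximal distance (Å) bridged by two linkers placed end to end."""
    residues = (bait_repeats + prey_repeats) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
    print(f"{bait}xL-{prey}xL: about {max_span(bait, prey):.1f} Å")
```

Under this linear assumption, each extra GGGGS repeat on one partner adds roughly 20 Å of
reach; the real gain is likely smaller and would have to be checked against structures.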
1.6 Research objectives
The results above suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the length of the DHFR PCA linker would make it
possible to detect interactions that are increasingly distant in space, which would modulate
the scale of molecular resolution. This adaptation would then yield a new hybrid method that
could help define protein-protein associations between protein complexes and subcomplexes.
The first objective was to assess the general impact of different linker lengths on the
ability to detect protein-protein associations. To reach this objective, the associations of
15 proteins found in seven protein complexes were tested against the proteins found in these
complexes and their known interactors. The second objective was to assess the impact of
increasing linker length on our understanding of the architecture of protein complexes and
of their subcomplexes. Five protein complexes that differ in size and flexibility were
studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG)
complex. The study was carried out with different combinations of linker lengths. The last
objective was to determine whether increasing linker length made it possible to detect
associations between proteins that are further apart in space. To do so, distances were
calculated between the proteins contained in the proteasome structures and were compared with
the experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of many powerful
genetic tools, its rapid cell division and the abundance of data on the structure of protein
complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge
in various fields, such as the determination of protein function, regulatory networks, gene
expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR
fragments improves the detection of protein-protein interactions and reveals interactions
that are more distant in space. Extended linkers thus provide an improved tool to detect and
measure protein-protein interactions and protein proximity in vivo. We used this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the conserved
oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the
architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG)
complexes in yeast. Our results open new avenues for the dissection of protein networks in
living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of cellular
functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36,
74, 75), and have shown how the regulation of protein expression at the transcriptional,
translational and posttranslational levels contributes to the diversity of protein complex
assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity
purification (82, 83). The majority of PPIs that populate interactome databases derive from
such methods because a single purification leads to the inference of many interactions among
the co-purified proteins. Unfortunately, very little is known about the structural and
context dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially
complementary because, on the one hand, they tell us which proteins assemble into complexes
in the cell and, on the other hand, how proteins may be physically located relative to one
another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future development in cell and systems biology because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins
are usually connected to the reporter fragments with a linker of ten amino acids. In
principle, the length of the linker limits the maximum distance between the proteins for an
interaction to be detectable. In the first large-scale study performed using DHFR PCA in
yeast, it was shown that the distance constraint imposed by linker length could affect the
ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other
protein complexes for which the distance between the C-termini of proteins could be measured,
protein interactions were 35 times more likely to be detected if the C-termini were within
less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that
increasing the linker length of the PCA reporter makes it possible to detect configuration
changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of
variable sizes could improve the detection of PPIs and even be used as a ruler to infer,
albeit roughly, distances between proteins in living cells. Here we test the effect of linker
size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
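The idea of extending the linker with synonymous codons can be checked with a few lines of
code. The sketch below translates Gly/Ser-only coding sequences and confirms that repeats
written with different codons encode the same GGGGS motif. The DNA sequences are hypothetical
examples chosen for illustration, not the sequences used in the plasmids.

```python
# Check that linker repeats written with different synonymous codons encode the same peptide.
# All DNA sequences below are hypothetical; the actual plasmid sequences are not given here.

CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",    # glycine codons
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S",    # serine codons
    "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence, codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


repeat_a = "GGTGGCGGAGGGTCT"  # GGGGS (hypothetical codon choice)
repeat_b = "GGCGGTGGGGGATCC"  # GGGGS, synonymous codons
repeat_c = "GGAGGGGGTGGCAGT"  # GGGGS, synonymous codons
repeat_d = "GGGGGAGGCGGTTCA"  # GGGGS, synonymous codons

linkers = {
    "2xL": repeat_a + repeat_b,
    "3xL": repeat_a + repeat_b + repeat_c,
    "4xL": repeat_a + repeat_b + repeat_c + repeat_d,
}

for name, dna in linkers.items():
    print(name, translate(dna))
```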
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from the gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel.
The remaining steps were performed as described above for the pAG25-3xL/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene
to fuse with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or because
they interact with one of them, based on data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was
also included in the set of preys as controls. Each prey was present in four replicates, two
on each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned in a way that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol
and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
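The block design described above, in which every bait-prey pair is tested with all four
linker combinations, can be enumerated as follows. This is only an illustrative sketch: the
protein names are placeholders and the strain-naming scheme is an assumption.

```python
from itertools import product

LINKERS = ("2xL", "4xL")  # linker sizes used in the intra-complexes experiment


def block_layout(bait: str, prey: str) -> list[tuple[str, str]]:
    """Return the four bait/prey linker combinations that make up one block."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]


# "ProteinA" and "ProteinB" are placeholder names, not strains from the study.
for bait_strain, prey_strain in block_layout("ProteinA", "ProteinB"):
    print(bait_strain, "x", prey_strain)
```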
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the
two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C.
Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time
of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
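For reference, the cell densities spotted in such a 5-fold dilution series can be computed as
below. The starting OD600/mL of 1 and the 5-fold steps come from the text; the number of
spots is not stated, so the value used here is an assumption.

```python
def dilution_series(start_od: float, fold: float, n_spots: int) -> list[float]:
    """OD600/mL of each spot in a serial dilution, most concentrated first."""
    return [start_od / fold ** i for i in range(n_spots)]


# Five spots is an assumed number, used only for illustration.
for od in dilution_series(1.0, 5, 5):
    print(f"{od:.4f}")
```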
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area smaller than 20 pixels and a
circularity lower than 0.5, using the particle that is closest to the center. We considered
the particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
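The transformation and filtering step can be sketched as follows. The function name and the
toy values are illustrative; the cut-off of 16 is taken from the text, whose units are
assumed to match those of the colony-size measurements.

```python
import math


def transform_colonies(mtx_intensity, diploid_size, min_diploid_size=16):
    """Return log2(x + 1) of MTX intensities, skipping colonies that grew poorly as diploids."""
    return [
        math.log2(x + 1)
        for x, size in zip(mtx_intensity, diploid_size)
        if size >= min_diploid_size
    ]


# Toy values, for illustration only: the second colony is dropped (diploid size 10 < 16).
print(transform_colonies([0, 3, 1023], [40, 10, 55]))
```

Adding 1 before taking the logarithm keeps colonies with zero intensity in the data set
(they map to 0) instead of producing an undefined value.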
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were conserved, and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered
significant detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
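The scoring procedure can be sketched as follows. The analysis itself relied on normalmixEM from the R package mixtools; the Python code below is only an illustrative stand-in that fits a two-component normal mixture with a basic expectation-maximization loop and then applies Zs = (Is − b)/sdb. The function names, the initialization and the fixed number of iterations are assumptions made for the example.

```python
import math

def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_normals(values, n_iter=200):
    """Basic EM fit of a two-component 1-D normal mixture.

    Returns two (weight, mean, sd) tuples sorted by mean; the component with
    the lower mean is taken here as the background distribution.
    """
    xs = sorted(values)
    half = len(xs) // 2
    # Crude initialization from the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    spread = (xs[-1] - xs[0]) / 4.0 or 1.0
    sd = [spread, spread]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in (0, 1)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: update weights, means and standard deviations.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(w, mu, sd), key=lambda c: c[1])

def z_scores(scores, b, sdb):
    """Convert interaction scores (Is) into z-scores: Zs = (Is - b) / sdb."""
    return [(s - b) / sdb for s in scores]

def detected(scores, b, sdb, threshold=2.5):
    """Flag interactions whose z-score exceeds the detection threshold."""
    return [z > threshold for z in z_scores(scores, b, sdb)]
```

With the fitted background component (b, sdb) in hand, the same z_scores call can be applied to each linker combination so that scores become comparable across combinations.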
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as
values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
conserved, and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
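The extreme-outlier rule can be written compactly. The sketch below is an illustration only: the function name is hypothetical, and the quartile convention (the "inclusive" method of Python's statistics.quantiles) is an assumption, since the quartile definition used in the original analysis is not stated.

```python
from statistics import quantiles

def drop_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

For example, with replicate colony sizes of 10, 11, 12, 13, 14 and 100, the quartiles are 11.25 and 13.75, the fences are 3.75 and 21.25, and only the value of 100 is discarded.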
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned in such a way that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL)
and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
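The three categories can be expressed as a small decision rule. The sketch below is illustrative: the function names and data layout are hypothetical, and the Benjamini-Hochberg procedure is assumed for the FDR correction because the text does not name the method that was used. The paired t-test p-values are taken as given.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_long, diff, fdr_p,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref  : True if Zs > 2.5 with the 2xL-2xL reference.
    detected_long : dict, linker combination -> True if detected.
    diff          : dict, linker combination -> Is difference vs 2xL-2xL.
    fdr_p         : dict, linker combination -> FDR-corrected p-value.
    """
    if not detected_ref:
        return "new" if any(detected_long.values()) else "not detected"
    for combo, delta in diff.items():
        if abs(delta) > min_diff and fdr_p[combo] < alpha:
            return "quantitative change"
    return "unchanged"
```

A pair is therefore labeled as a quantitative change only when both conditions hold for at least one longer linker combination: a difference in Is larger than 1 in absolute value and a corrected p-value below 0.05.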
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the USEARCH software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between C-terminal
residues. In several cases, the structure of the protein is not complete in the C-terminal
section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as
nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the two nodes. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered as surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
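The surface-path distance can be sketched with the Python standard library alone. The analysis used the Dijkstra implementation of NetworkX; the code below is a self-contained illustration with hypothetical function names. It assumes that the surface Cα coordinates have already been extracted, and it reads "the mean of the minimal distances" as follows: for each copy of the first protein, take the distance to the closest copy of the second protein, then average these minima.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Build a weighted graph from surface C-alpha coordinates.

    coords : dict, node label -> (x, y, z) in angstroms.
    Nodes closer than `cutoff` are linked by an edge weighted by their distance.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in graph[node]:
            nd = d + weight
            if nd < best.get(nxt, math.inf):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return math.inf

def mean_minimal_distance(graph, copies_a, copies_b):
    """Average, over copies of A, of the distance to the closest copy of B."""
    minima = [min(shortest_path_length(graph, a, b) for b in copies_b)
              for a in copies_a]
    return sum(minima) / len(minima)
```

Because edges only connect surface nodes that lie within the cutoff, the resulting path follows the surface of the complex instead of cutting through its interior, which is why it can be longer than the straight-line distance between two C-termini.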
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL). These include proteins known to interact with the baits, proteins that are within
the same complexes as the baits and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), as
compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
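The size of the screen follows directly from the design described above, as this short check shows (variable names are chosen for the example only):

```python
# 15 bait proteins, each built with three linker lengths (2xL, 3xL, 4xL).
n_proteins, n_linkers, n_preys = 15, 3, 592

n_baits = n_proteins * n_linkers   # bait strains queried
n_tests = n_baits * n_preys        # bait-prey combinations, before replication

print(n_baits, n_tests)
```

The two products give the 45 baits and the 26,640 potential interactions reported in the text, each of which was tested in four replicates.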
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique; reciprocal interactions
such as X-DHFR F[1,2] – Y-DHFR F[3] and Y-DHFR F[1,2] – X-DHFR F[3] account for only one
PPI).
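Counting unique pairs amounts to ignoring the bait/prey orientation. A minimal sketch, using a hypothetical function name and protein pairs chosen only for illustration:

```python
def unique_pairs(oriented_pairs):
    """Collapse (bait, prey) and (prey, bait) into one unordered pair."""
    return {frozenset(pair) for pair in oriented_pairs}

# Illustrative input: the first two entries are reciprocal tests of one pair.
tested = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
print(len(tested), len(unique_pairs(tested)))
```

Here three oriented pairs collapse into two unordered pairs, in the same way that the 228 oriented interacting pairs correspond to 197 unique pairs.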
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2] – Y-DHFR F[3] versus Y-DHFR F[1,2] – X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228 or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linker sizes (Fig S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
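The relation between distance and z-score was summarized with Spearman rank correlations. As an illustration of what this statistic computes, the sketch below implements it with the standard library only: values are replaced by their ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. The function names are hypothetical, and no p-value is computed.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A negative coefficient, as reported for the proteasome, means that pairs whose C-termini are further apart tend to rank lower in z-score, without assuming that the relationship is linear.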
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as
X-DHFR F[1,2] – Y-DHFR F[3] and Y-DHFR F[1,2] – X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) are considered positive and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to
detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of change are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
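The thresholding described in panel (B) can be sketched as follows. This is only a minimal illustration, assuming that the z-score of a colony is computed against the mean and standard deviation of a background distribution. The function names, the toy colony sizes and the cutoff passed in the usage lines are hypothetical and are not taken from the analysis pipeline used in this work.

```python
def z_scores(colony_sizes, background_mean, background_sd):
    """Convert colony sizes to z-scores relative to a background distribution.

    colony_sizes maps a (bait, prey) pair to a measured colony size.
    background_mean and background_sd are assumed to have been estimated
    beforehand from colonies that reflect background growth.
    """
    if background_sd <= 0:
        raise ValueError("background_sd must be positive")
    return {
        pair: (size - background_mean) / background_sd
        for pair, size in colony_sizes.items()
    }


def positive_ppis(colony_sizes, background_mean, background_sd, threshold):
    """Return the set of pairs whose z-score is strictly above the threshold."""
    scores = z_scores(colony_sizes, background_mean, background_sd)
    return {pair for pair, z in scores.items() if z > threshold}


# Toy usage with made-up numbers (hypothetical protein labels and sizes).
sizes = {("P1", "P2"): 10.0, ("P1", "P3"): 30.0}
hits = positive_ppis(sizes, background_mean=10.0, background_sd=5.0, threshold=2.5)
```

Keeping the threshold as an explicit argument makes it easy to check how sensitive the set of positive PPIs is to the chosen cutoff.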
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres (dots): surface. Green spheres: surface residues on the proteasome.
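The idea of a distance-weighted shortest path over surface residues, as illustrated in panels (C) and (D), can be sketched with Dijkstra's algorithm. This is a hedged illustration, not the procedure actually used to produce the figure: it assumes that surface residues are reduced to 3D points, that two residues are connected when they lie within a neighbour cutoff, and that each edge is weighted by its Euclidean length. The function name, the `cutoff` parameter and the toy coordinates are all hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Dijkstra shortest path between two residues over a point graph.

    coords maps a residue label to its (x, y, z) position. Two residues are
    treated as neighbours when their Euclidean distance is <= cutoff
    (an assumed graph construction). Returns (path_length, path), or
    (math.inf, []) when the end residue cannot be reached.
    """
    best = {start: 0.0}
    previous = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        dist_so_far, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            break
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step > cutoff:
                continue
            candidate = dist_so_far + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                previous[other] = node
                heapq.heappush(heap, (candidate, other))
    if end not in best:
        return math.inf, []
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[end], path


# Toy usage: "cterm_A" and "cterm_B" stand in for two C-terminal residues.
toy = {"cterm_A": (0, 0, 0), "surf_1": (3, 4, 0), "cterm_B": (6, 8, 0)}
length, route = shortest_surface_path(toy, "cterm_A", "cterm_B", cutoff=6.0)
```

Because the path is forced through neighbouring surface points, the resulting length follows the surface of the complex rather than cutting straight through it, which is the reason for preferring it over a direct Euclidean distance in this kind of sketch.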
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, the term
"hybrid method" refers to a method that detects associations between proteins that are close
to each other in space without necessarily being in physical contact. Such a method would
make it possible to explore and dissect the architecture of protein complexes in greater
depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
linker length changed the set of interactions detected. It was also relevant to test the
method on protein complexes using several combinations of linkers of different lengths.
Finally, the validity of the method could be further confirmed by comparing the results
with distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all the unique interactions detected, more than
30% were new. One could therefore expect to obtain almost as many new interactions if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths used.
For example, it could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases. The rare cases in which the
signal decreased as linker length increased are more likely caused by steric effects than by
destabilization of the proteins involved. Nevertheless, these cases can still provide
structural information, notably by identifying the strongest associations within the
complex. In addition, the use of extended linkers provides information on the organization
of protein complexes, particularly when it involves central proteins. Finally, the
associations detected accurately reflect the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results, it is possible to confirm that increasing linker length does allow the
detection of associations between proteins that are farther apart in space.
The modification made to DHFR PCA represents a real advance in the study of protein-protein
associations. Simply by doubling the length of the linker on the DHFR F[1,2] fragment, it is
possible to increase the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside the linkers of
additional lengths. This would provide validation and a point of comparison, and would help
detect problems that may have arisen during construction of the proteins. For example, it
becomes easier to spot incorrect recombination or the appearance of mutations: interactions
would be observed for the correctly constructed protein, whereas the problematic one would
show none. Admittedly, adding this control makes the experiments and analyses more complex.
Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that
remains relatively simple. It requires no special infrastructure, yet it can also be applied
at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains
the endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close together, which reduces false positives. DHFR PCA
can be performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the study of interaction dynamics (56). These
features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths providing several resolutions of the PPI network. It would first be
necessary to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. It would also be necessary to determine the
optimal increment for maximizing new information, taking into account the additional
complexity introduced with each added linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2] proteins with different linker lengths.
Such additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the associations detected, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas the resolution should be coarser
for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. The size
of these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us
a general picture of their architecture. For the proteome of an organism, this method would
provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
µb Estimated mean
µg Microgram
µL Microliter
µM Micromolar
2YT 2x yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif
Acknowledgments
Completing this project required the help of many people, whom I sincerely wish to thank.
First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey,
Christian encouraged me to give the best of myself, both scientifically and as a member of
the group. He not only gave me the material means to do so, but also showed me that I had
the ability to do it. Christian is a very present supervisor who is available for his
students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and
Dr. Nicolas Bisson, for their advice and the time they devoted to me during this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two
research professionals of the laboratory. Their great expertise and their passion for
science are a pillar of this team. Without their valuable advice, their dedication and their
availability, carrying out this project would have been particularly arduous. I also wish to
thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my
thesis is all the better. Special thanks to Xavier for his help, his availability and the
stimulating discussions.
I think it is important to thank all the members of the Landry laboratory. Graduate studies
require spending a lot of time in the laboratory, which becomes like a second home. Hence
the importance of sharing laughs and building a rapport with its members. I would like to
thank them all for the chats and laughs at the famous "tea break", the lively discussions
and, of course, the support both in the laboratory and morally. Thank you to Claudine for
the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie
for her collaboration and her smile, and to Marie for her advice on analysis. A very special
thank you to Guillaume and Hélène, who were particularly good at putting a smile on my face
or supporting and advising me in difficult times.
It is also important to thank my parents, as well as my whole family and my friends.
My parents have always encouraged me to fulfil myself and to love my work. They not only
provided me with an ideal setting in which to reach my goals throughout my studies, but
also offered me their moral support and instilled in me the importance of always doing
one's best. The values they passed on to me gave me a strong sense of responsibility,
honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have
fun and pour my heart out from time to time. They were a source of moral support.
Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is
an incredibly generous person: generous with his time, his attention, his knowledge and
his passions. He was an invaluable support throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding soothed my fears and my
sorrows. He was also there to celebrate the successes. I have no words to describe how
much this person has given me personally, humanly and professionally. Marc has made me a
better person, and I will always be grateful to him. Thank you, my love, thank you for
everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that
will be submitted for publication. The article presents the adaptation of the PCA method to
detect associations between proteins that are distant in space, and its application to the
study of protein complexes. I contributed to the planning of the experiments with
Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé
(research professionals). Several people, myself included, took part in carrying out these
experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student),
Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses
were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The
analysis of the results and the writing of the article were carried out jointly by
Isabelle Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to the writing of a literature review published
in Briefings in Functional Genomics in March 2016 under the title "Multi-scale
perturbations of protein interactomes reveals their mechanisms of regulation, robustness
and insights into genotype-phenotype maps". Several people took part in the writing:
Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette
(doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's
student) and Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental role of protein-protein interactions
Proteins, through their great diversity of roles, are regarded as the machinery of life.
Their temporary or permanent associations lie at the heart of signalling and regulatory
pathways as well as of protein complexes. Proteins can interact with one another through
intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals
forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the
proper functioning of the cell, since they are involved in all cellular processes and in
the maintenance of cellular functions.
Interactions that form transiently are often found in signalling and regulatory
processes. They require excellent spatiotemporal coordination, which explains why poor
coordination can lead to diseases such as cancer (1). One example of a transient
association is that of the two catalytic subunits and the two regulatory subunits of
protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association
and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the
transition from one form to the other controls several processes, including energy
metabolism, cell growth, ageing and the response to stimuli (3-7). In humans,
misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs of a complex
are stable, the complex itself may form only in a particular context. A protein complex
can be defined as an association between two or more proteins (9). The association between
these proteins allows the emergence of additional biological activities that would be
impossible for the individual proteins. An example that illustrates this concept very well
is the proteasome, a protein complex involved in protein homeostasis through the
degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved
among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two
regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than
one copy (10-13). Given its importance in protein recycling, the proteasome is an
attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network has
been mapped at large scale for several organisms, notably humans (17), Saccharomyces
cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several
bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the
network and do not fully take into account the cell's capacity to adapt to different
conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps
were subsequently produced that consider the dynamics of interaction networks, by
perturbing cell growth conditions. Among other things, they provide information on the
adaptation, or plasticity, of an organism in the presence of a stress or a new
environment. Despite this new perspective, it remains difficult to distinguish a stable
interaction from a transient one using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the
study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which
diverged more than 1.5 billion years ago, show both similarities and differences in the
structure of their nuclear pore. This essential protein complex forms a channel in the
membrane of the cell nucleus and controls the transport of molecules between the nucleus
and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear
pore and the part that subsequently diverged. The structural differences explain the
distinct mRNA export mechanisms of the two organisms (30). In addition, perturbing PPIs
makes it possible to elucidate the robustness of a protein complex to mutations, that is,
the ability of the complex to function despite the perturbation. Diss and colleagues
systematically deleted the genes encoding the proteins found in the nuclear pore and the
retromer (31). The retromer is a non-essential protein complex whose function is to
recycle membrane receptors. By analysing the interactions present in these complexes after
each perturbation, the authors observed that the nuclear pore remained functional despite
the loss of certain proteins, whereas the retromer dissociated completely after the loss
of a single protein. They were thus able to identify the proteins essential for the
assembly of these complexes and to demonstrate the importance of paralogs for robustness
(31).
In the medical field, the study of PPIs has been widely used to discover new drugs
(32-34). Moreover, identifying structural differences in a protein complex between two
organisms can provide attractive targets for selectively inhibiting the complex of one
organism. Very recently, a research group developed an inhibitor that targets the
proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma
brucei, which should eventually make it possible to treat the infections caused by these
parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by
Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of
Mendelian diseases. In nearly 60% of cases, perturbation of the interaction networks,
whether partial or complete, was responsible for the diseases under study. Furthermore,
different mutations in the same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been
developed to study them. These methods are complementary, since each has advantages and
limitations that allow it to target only a particular subset of the interaction network
(37). Nevertheless, all of these methods can be divided into two main categories: methods
that determine the composition of protein complexes, and methods that determine the
physical interactions between two proteins.
The first category includes methods that purify a protein complex, either by affinity
chromatography or by separation chromatography, and then analyse it by mass spectrometry
(MS). The second category brings together a wide variety of methods, including the yeast
two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH) and the protein-fragment
complementation assay (PCA). The methods in this second category share a very similar
principle, based on the reconstitution of a functional reporter that emits a signal when
the two proteins physically interact. The second category also includes three hybrid
methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by
MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid
method" refers to methods that detect associations between proteins that are close
together in space without these necessarily being physical interactions. These methods
therefore have characteristics of both categories. For the purposes of this project, they
are considered part of the second category because they provide information on the spatial
relationships between proteins.
The two categories of methods are complementary because they define, on the one hand,
the components of a protein complex and, on the other hand, the relationships among those
components.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS
is a method that aims to isolate a protein complex and identify its members. Several
techniques are used to purify protein complexes, including affinity chromatography.
Affinity chromatography separates a protein of interest and its interactors from a protein
extract by means of an epitope specific to that protein. This epitope is recognized by an
antibody bound to the purification column. Several rounds of purification can be performed
to reduce the non-specific interactions that cause background noise. The isolated proteins
are then digested into peptides. The mass spectrometer ionizes these peptides and
separates them according to their mass-to-charge ratio, producing a mass spectrum.
Comparing the resulting profiles with those in a database identifies the proteins found in
the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first
MS, a peptide is selected and fragmented, and a new spectrum is acquired from the
resulting fragments. This additional spectrum provides more information about the peptide
(41, 42). Other purification techniques exist, such as size-exclusion chromatography, in
which separation is based on the size of the protein complexes. The main advantage of this
purification is that it allows all the protein complexes of an organism to be isolated for
study (43).
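The separation of peptides by mass-to-charge ratio described above can be illustrated with a short calculation. The following is only a sketch and is not taken from this work: it assumes positive-mode ionization by proton addition, and the peptide mass and the function name are hypothetical.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of a peptide that has gained `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Hypothetical peptide of 1500.0 Da observed at charge states 1+ to 3+.
for z in (1, 2, 3):
    print(z, round(mass_to_charge(1500.0, z), 4))
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason database matching is needed to assign peaks to peptides.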
1.3.2 Methods determining the protein interaction network
1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid and the protein-fragment
complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter
fragments, each attached to one of the two proteins of interest via a linker. When the two
proteins of interest physically interact, the two reporter fragments assemble,
reconstituting a functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the
yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription
factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering
method that enabled the development of several others. However, the technique has some
limitations. First, in classical Y2H, the proteins studied must be soluble. Variants of
the method have nevertheless been developed to allow the study of membrane proteins
(45-47); one of these is the subject of the next paragraph. Second, because the reporter
is a transcription factor, the interactions tested must be localized in the nucleus, which
may alter the endogenous localization of the proteins. The technique also has low
sensitivity, suffers from background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is consequently
impossible to establish links between the abundance of a protein and the strength or
abundance of an interaction between proteins (48-50). Despite these constraints, it is
still widely used because it allows the PPIs of another species, such as humans, to be
studied in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a
transcription factor is attached. When the proteins of interest physically interact, the
transcription factor attached to the reconstituted ubiquitin is released, activating the
transcription of a reporter gene. Split-ubiquitin-based methods have enabled major
advances in the study of insoluble membrane proteins located outside the nucleus. However,
MYTH shares some drawbacks with Y2H, such as substantial background noise and the
impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a
transcription factor as the reporter, it uses a protein that has been split into two
fragments. The choice of reporter and of the cleavage site were decisive elements in the
design of the method. Moreover, because the reporter fragments come from a single protein
rather than from two subunits of the same protein, they do not tend to associate
spontaneously unless they are very close to each other, which reduces background noise
(54). In yeast, PCA uses as reporter a mutated version of the enzyme dihydrofolate
reductase (DHFR), which confers resistance to methotrexate (MTX) on the cell. This enzyme
is essential for cell growth and is involved, in particular, in the synthesis of certain
DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that
is, the number of cells that have managed to grow on the selection medium. This technique
has the advantage of being quantitative, in addition to retaining the natural promoter of
the proteins studied (48, 55, 56). Furthermore, results obtained by PCA suggest that the
cellular localization of the proteins is preserved. Indeed, there is a gene ontology
enrichment for several known proteins sharing the same cellular localization (55).
However, a change in localization cannot be ruled out, given that the reporter fragments
are added at the C-terminus, which could interfere with the localization signal sequence
of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
However, adding a linker often reduces these risks by moving the reporter fragment away
from the protein to which it is attached, which reduces interference between the two. It
may be necessary to optimize its composition or its length. There are three categories of
linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers
are generally used when some mobility between the protein of interest and the reporter
fragment is desirable. Rigid linkers provide better separation between the protein of
interest and the reporter fragment and ensure that the functions of each element are
maintained. They are especially useful when a flexible linker is insufficient to separate
the two elements properly or when it interferes with the activity of the protein. In vivo
cleavable linkers allow the reporter fragment to be released under certain conditions.
They are particularly useful for allowing each element to carry out its own biological
activity. Consequently, it is essential to choose the linker and its parameters carefully
to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although placed in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower
resolution.
FRET relies on energy transfer between two fluorescent proteins located close to each
other. The two fluorescent proteins are fused to the two proteins whose proximity is to be
tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close to each other. The interaction is
detected by microscopy or by cytometry through the emission of the acceptor fluorescent
protein. This method is particularly useful for following an interaction over time.
However, substantial background noise and the partial overlap of the fluorescence of the
two proteins can hamper the interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that, before purification, the proteins are attached to one another by
covalent bonds. These bonds withstand enzymatic digestion, thereby providing structural
information on how the proteins associate within the protein complex. Nevertheless,
cross-linking complicates data analysis and can potentially lead to a mistaken picture of
the architecture of the protein complex. This method is difficult to apply to the global
study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity,
which is fused to the protein of interest. Interactors bearing a biotin group on their
accessible lysines are selectively isolated and identified by MS. BioID can detect weak
and transient interactions in addition to interactions between neighbouring proteins.
However, the biotin ligase is larger than the green fluorescent protein (GFP), a
fluorescent protein widely used in molecular biology. This large size can impair the
activity of the protein of interest or the formation of interactions. In addition, this
method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more
global view of the PPI network. They provide information on the proximity of proteins,
giving access to a new scale of molecular resolution that is otherwise difficult to reach.
In addition to being complex, the existing techniques require specific infrastructure
(equipment and databases) and are difficult to apply at large scale. The development of
simpler, higher-throughput hybrid methods would make it possible to better define the
architecture of protein complexes and their subcomplexes at low molecular resolution. Such
methods would complement the two categories of methods. These new hybrid methods would
compensate for the shortcomings of high-resolution methods such as crystallography or
nuclear magnetic resonance, which determine the precise structure of proteins or protein
complexes. Indeed, these high-resolution methods are difficult to apply to many protein
complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and because of the linker that connects the reporter
fragments to the proteins of interest, PCA is a method of choice for developing a hybrid
method. The linker is a short, soluble and flexible peptide segment composed of two
repeats of the following motif: four glycines and one serine (GGGGS). It ensures good
flexibility and good association of the reporter fragments in the cellular environment.
Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter
polar and uncharged. The linker connects the reporter fragment to the C-terminus of the
proteins under study.
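The repeat structure of the linker can be written out explicitly. The sketch below is illustrative only: the function name is hypothetical, and the 3xL and 4xL variants are taken from the list of abbreviations rather than from this paragraph, which describes only the two-repeat linker.

```python
MOTIF = "GGGGS"  # four glycines followed by one serine


def build_linker(repeats: int, motif: str = MOTIF) -> str:
    """Return the amino acid sequence of a linker made of `repeats` motifs."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return motif * repeats


# Linker names follow the list of abbreviations (2xL, 3xL, 4xL).
LINKERS = {f"{n}xL": build_linker(n) for n in (2, 3, 4)}

for name, sequence in LINKERS.items():
    print(name, sequence, len(sequence))
```

Each additional repeat adds five residues, so the 2xL, 3xL and 4xL linkers are 10, 15 and 20 residues long.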
The length of the linker also places a constraint on the ability to detect an
interaction, as was notably observed by the research team that developed PCA at large
scale (55). While studying RNA polymerase (RNApol) II and several other protein complexes,
the authors noted that an interaction was 35 times more likely to be detected when the
C-termini of the proteins of interest were located less than 82 Å apart (55). This
distance corresponds to the length of the two linkers placed end to end. Furthermore, an
earlier study had shown that increasing the length of the linker made it possible to
determine the conformation of a dimeric receptor (69). It is thus possible to detect new
interactions and, in doing so, to obtain new structural information.
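The relationship between the 82 Å threshold and linker length can be made concrete with simple arithmetic. The sketch below is not a calculation from this work. It takes the reported 82 Å span for two two-repeat linkers (2 × 10 residues) and, as an explicit assumption, extrapolates linearly to longer linkers; real flexible linkers are rarely fully extended, so these values are best read as rough upper bounds. All names are hypothetical.

```python
RESIDUES_PER_MOTIF = 5          # GGGGS
REPORTED_SPAN_ANGSTROM = 82.0   # reported span of two 2-repeat linkers end to end
BASE_REPEATS = 2                # repeats per linker in the original DHFR PCA


def implied_length_per_residue() -> float:
    """Length per residue (in Å) implied by the reported 82 Å span."""
    total_residues = 2 * BASE_REPEATS * RESIDUES_PER_MOTIF
    return REPORTED_SPAN_ANGSTROM / total_residues


def extrapolated_span(repeats_a: int, repeats_b: int) -> float:
    """Span (in Å) of two linkers end to end, ASSUMING linear scaling."""
    residues = (repeats_a + repeats_b) * RESIDUES_PER_MOTIF
    return residues * implied_length_per_residue()


print("implied Å per residue:", implied_length_per_residue())
for a, b in [(2, 2), (2, 4), (4, 4)]:
    print(f"{a}xL + {b}xL:", round(extrapolated_span(a, b), 1), "Å")
```

Under this linear assumption, combining linkers of different lengths on the two partners gives intermediate spans, which is the intuition behind testing different combinations of linker lengths.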
1.6 Research objectives
The preceding results suggest that linker length can influence our ability to detect
PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker
would make it possible to detect interactions increasingly distant in space, thereby
modulating the scale of molecular resolution. This adaptation would then yield a new
hybrid method that could help define protein-protein associations between protein
complexes and subcomplexes. The first objective was to assess the general impact of
different linker lengths on the ability to detect protein-protein associations. To this
end, protein-protein associations between 15 proteins found in seven protein complexes
were tested against the proteins found in these complexes and their known interactors. The
second objective was to assess the impact of increasing linker length on the understanding
of the architecture of protein complexes and their subcomplexes. Five protein complexes
differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and
the conserved oligomeric Golgi (COG) complex. The study was carried out with different
combinations of linker lengths. The final objective was to determine whether increasing
linker length allowed the detection of associations between proteins more distant in
space. To do so, distances between the proteins contained in the proteasome structures
were calculated and compared with the experimental results.
Cette eacutetude a eacuteteacute effectueacutee en utilisant lrsquoorganisme modegravele eucaryote S cerevisiae En effet
la levure est particuliegraverement inteacuteressante pour plusieurs aspects notamment la disponibiliteacute
de nombreux et puissants outils geacuteneacutetiques sa vitesse de division cellulaire rapide et
lrsquoabondance de donneacutees concernant la structure des complexes proteacuteiques et les PPI Par
ailleurs cet organisme a joueacute un rocircle primordial dans lrsquoavancement des connaissances dans
divers domaines tels que la deacutetermination de la fonction des proteacuteines les reacuteseaux de
reacutegulation lrsquoexpression des gegravenes les reacuteseaux drsquointeractions proteacuteiques et lrsquoeacutetude des
maladies humaines (70)
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR fragments
improves the detection of protein-protein interactions and reveals interactions that are more
distant in space. Extended linkers thus provide an improved tool to detect and measure
protein-protein interactions and protein proximity in vivo. We used this tool to further
investigate the architecture of the RNA polymerases, the proteasome and the conserved
oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are farther apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the
architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG)
complexes in yeast. Our results open new avenues for the dissection of protein networks in
living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of cellular
functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36,
74, 75), and have shown how the regulation of protein expression at the transcriptional,
translational and posttranslational levels contributes to the diversity of protein complex
assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity
purification (82, 83). The majority of PPIs that populate interactome databases derive from
such methods because a single purification leads to the inference of many interactions among
the co-purified proteins. Unfortunately, very little is known about the structural and
context dependencies of PPIs inferred from co-complex membership because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies
based on similar principles (52). The two categories are potentially complementary because,
on the one hand, they tell us which proteins assemble into complexes in the cell and, on the
other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future development in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the
technological advantages required for such an approach by complementing methods that
detect co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins
are usually connected to the reporter fragments with a linker of ten amino acids. In
principle, the length of the linker limits the maximum distance between the proteins for an
interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast
showed that the distance constraint determined by linker length could affect the ability to
detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of
each other. In addition, an earlier study in mammalian cells showed that increasing the
linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable
sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit
roughly, distances between proteins in living cells. Here, we test the effect of linker size
on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
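The idea of adding repeats with synonymous codons can be checked with a short sketch. It builds the peptide for a given number of Gly-Gly-Gly-Gly-Ser repeats and verifies that two different DNA encodings translate to the same motif. The two DNA strings are hypothetical examples written for this illustration; they are not the sequences used in the plasmids.

```python
# Minimal codon table covering only the two residues of the linker motif.
CODON_TO_AA = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}
MOTIF = "GGGGS"


def linker_peptide(repeats: int) -> str:
    """Peptide sequence of an NxL linker (2xL, 3xL, 4xL, ...)."""
    return MOTIF * repeats


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence, codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TO_AA[codon] for codon in codons)


# HYPOTHETICAL encodings of one repeat, differing only by synonymous codons.
REPEAT_A = "GGTGGCGGAGGGTCT"
REPEAT_B = "GGAGGTGGGGGCAGC"

if __name__ == "__main__":
    print(translate(REPEAT_A), translate(REPEAT_B))
    print(linker_peptide(4))
```

Varying the codons between repeats keeps the peptide identical while reducing exact DNA repeats, which is one plausible reason for this design choice.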
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was
gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI
and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively),
were amplified by PCR from their respective plasmids with oligonucleotides specific to the
gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were
confirmed by PCR and Sanger sequencing for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed a second time with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins that are part of seven
complexes, fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belonged to the same complexes as the baits or
interacted with one of them according to data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was
also included in the set of preys as controls. Each prey was present in four replicates, two
on each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44
tested) and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing
89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol
and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
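The block design can be sketched in a few lines. The example enumerates the four linker combinations tested for one bait-prey pair and counts the positions left inside a double border. The 32 x 48 geometry is the usual layout of a 1536-position array and is an assumption here, as are the function names and the protein labels.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str):
    """Four (bait, prey) linker combinations forming one block on the array."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]


def interior_positions(rows: int = 32, cols: int = 48, border: int = 2) -> int:
    """Positions left once a border of the given width is reserved.

    ASSUMPTION: a 1536-position array is laid out as 32 rows by 48 columns.
    """
    return (rows - 2 * border) * (cols - 2 * border)


if __name__ == "__main__":
    # "BaitX" and "PreyY" are placeholder protein names.
    for combo in linker_block("BaitX", "PreyY"):
        print(combo)
    print(interior_positions(), "positions available inside the double border")
```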
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the
two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C.
Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time
of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we retested by standard DHFR PCA 25 PPIs for which the
signal had increased, remained unchanged or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed the
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
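As a small worked example of the spot-dilution series, the sketch below lists the cell densities obtained by successive 5-fold dilutions from an OD600/mL of 1. The number of dilution steps is not stated in the text, so the value used here is a placeholder.

```python
def serial_dilutions(start_od: float, fold: int, steps: int):
    """Cell densities of a dilution series, starting with the undiluted culture."""
    return [start_od / fold ** i for i in range(steps)]


if __name__ == "__main__":
    # ASSUMPTION: four spots per strain; the text does not give this number.
    for od in serial_dilutions(start_od=1.0, fold=5, steps=4):
        print(f"OD600/mL = {od:g}")
```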
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle closest to the center. We considered the
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
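The particle filter can be illustrated outside ImageJ. The sketch below uses the circularity definition reported by ImageJ, 4π × area / perimeter², together with the two thresholds given above. Treating each threshold as an independent exclusion criterion is an assumption, since the text can be read either way; the function names are also illustrative.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Circularity as defined in ImageJ: 1.0 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2


def keep_particle(area: float, perimeter: float,
                  min_area: float = 20, min_circularity: float = 0.5) -> bool:
    """Return True when a particle passes both filters.

    ASSUMPTION: a particle is rejected if it fails either threshold.
    """
    return area >= min_area and circularity(area, perimeter) >= min_circularity


if __name__ == "__main__":
    radius = 10
    round_colony = (math.pi * radius ** 2, 2 * math.pi * radius)
    print(keep_particle(*round_colony))          # compact, large particle
    print(keep_particle(area=30, perimeter=62))  # elongated streak
```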
Colony intensity values from day 4 of growth on the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
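These two steps amount to a filter followed by a transformation. A minimal sketch, assuming each colony is stored as a small dictionary with hypothetical keys `diploid_size` and `mtx_intensity`:

```python
import math


def log2_intensity(raw: float) -> float:
    """log2(x + 1), so that a raw intensity of 0 maps to 0 instead of -inf."""
    return math.log2(raw + 1)


def usable_intensities(colonies, min_diploid_size: float = 16):
    """Drop colonies that grew poorly on diploid selection, then transform."""
    return [
        log2_intensity(colony["mtx_intensity"])
        for colony in colonies
        if colony["diploid_size"] >= min_diploid_size
    ]


if __name__ == "__main__":
    # Illustrative values only, not experimental data.
    colonies = [
        {"diploid_size": 40, "mtx_intensity": 1023},
        {"diploid_size": 10, "mtx_intensity": 500},   # removed: diploid too small
        {"diploid_size": 25, "mtx_intensity": 0},
    ]
    print(usable_intensities(colonies))
```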
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
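The scoring procedure can be sketched as follows. The analysis itself used the R package mixtools; the block below substitutes a small hand-written expectation-maximization fit in Python so that the example is self-contained, and it should be read as an illustration of the principle rather than a reimplementation. The initialization from the lower and upper halves of the data, the iteration count and all function names are assumptions.

```python
import math
from statistics import median


def interaction_score(replicates, min_replicates=2):
    """Median colony size (Is), or None when too few replicates remain."""
    return median(replicates) if len(replicates) >= min_replicates else None


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, iterations=200):
    """EM fit of a two-component normal mixture.

    Returns [weight, mean, sd] for each component, sorted by mean, so the
    first entry is taken as the background distribution.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    comps = []
    for part in (ordered[:half], ordered[half:]):   # crude initialization
        mu = sum(part) / len(part)
        var = sum((v - mu) ** 2 for v in part) / len(part)
        comps.append([0.5, mu, max(math.sqrt(var), 1e-3)])
    for _ in range(iterations):
        # E-step: probability that each value belongs to component 0.
        resp = []
        for v in values:
            p0 = comps[0][0] * normal_pdf(v, comps[0][1], comps[0][2])
            p1 = comps[1][0] * normal_pdf(v, comps[1][1], comps[1][2])
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: update weight, mean and sd of both components.
        for k, weights in enumerate((resp, [1 - r for r in resp])):
            n_k = sum(weights)
            if n_k < 1e-9:
                continue
            mu = sum(w * v for w, v in zip(weights, values)) / n_k
            var = sum(w * (v - mu) ** 2 for w, v in zip(weights, values)) / n_k
            comps[k] = [n_k / len(values), mu, max(math.sqrt(var), 1e-3)]
    return sorted(comps, key=lambda c: c[1])


def z_scores(scores):
    """Convert {pair: Is} into {pair: Zs} with Zs = (Is - b) / sdb."""
    _, b, sd_b = fit_two_normals(list(scores.values()))[0]
    return {pair: (score - b) / sd_b for pair, score in scores.items()}
```

An interaction would then be called detected when its Zs exceeds 2.5, and two linker combinations would be compared through the difference of their Zs values.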
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as
values below Q1 − 3 × (Q3 − Q1) or above Q3 + 3 × (Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control
interaction and positioned on the array edges were removed from downstream analyses, as were
strains for which sequencing results revealed mutations in the DHFR fusion proteins. After
these final filtering steps, interactions with at least four replicates for every linker
combination were retained and the median colony size was used as the Is. Significant
interactions were identified as described above (Fig. S1B). For the RNApol and the
proteasome, the estimated mean (b) and standard deviation (sdb) of the background
distribution were calculated for each linker combination and each complex separately. For
the COG complex, because the number of pairwise interactions is limited to 64, all the
results were combined to calculate these parameters. An interaction was considered detected
when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions
with at least one linker combination, some pairs were filtered out, mainly because they did
not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR
F[3]) presented incoherent results for all tested interactions, leaving us with a total of
228 (197 unique) pairs of interacting proteins.
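The outlier rule and the replicate threshold can be written compactly. This sketch applies the fence to the replicates of a single interaction, which is an assumption: the text does not state whether the quartiles were computed per interaction or per plate. The quartile convention (`method="inclusive"`) is likewise a choice made for the example.

```python
from statistics import median, quantiles


def drop_extreme_outliers(sizes, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(sizes, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [s for s in sizes if low <= s <= high]


def interaction_score(sizes, min_replicates=4):
    """Median of the retained replicates, or None if fewer than four remain."""
    kept = drop_extreme_outliers(sizes)
    return median(kept) if len(kept) >= min_replicates else None


if __name__ == "__main__":
    # Illustrative colony sizes; the last one is an extreme outlier.
    replicates = [10, 11, 12, 13, 14, 15, 16, 100]
    print(drop_extreme_outliers(replicates))
    print(interaction_score(replicates))
```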
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer
linker combination) were separated from the others and classified as new interactions (Table
S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of
four adjacent strains, all combinations of linker lengths could be tested for a specific
interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size
combinations could be compared directly. The difference from the reference 2xL-2xL
interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A
paired t-test was used to identify significant differences in colony size (with
FDR-corrected p-values). These pairs of interacting proteins were separated into two
additional categories: unchanged interactions, in cases where the interaction was detected
with the reference linker size (2xL-2xL) and also with the longer linker combinations but
without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in
cases where the interaction was detected with the reference linker size (2xL-2xL) and
presented significant changes for at least one longer linker combination (difference greater
than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
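The classification logic can be summarized in code. The sketch takes the paired t-test p-values as given and shows only the multiple-testing correction and the three-way classification. The Benjamini-Hochberg procedure is used as the FDR correction, which is an assumption because the text does not name the method. Pairs with a significant but small difference are reported here as unchanged, which is also an interpretation.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_long, differences, q_values,
             min_difference=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref: detected with 2xL-2xL; detected_long: detected with at
    least one longer combination; differences and q_values hold one entry
    per longer linker combination (Is difference from 2xL-2xL, FDR p-value).
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    changed = any(abs(d) > min_difference and q < alpha
                  for d, q in zip(differences, q_values))
    return "quantitative change" if changed else "unchanged"


if __name__ == "__main__":
    # Illustrative p-values only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
    print(classify(True, True, [0.2, 1.6, 1.9], [0.40, 0.01, 0.003]))
```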
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a
full core. A complete proteasome structure was built by superposing two 5A5B structures on
the structure of 5CZ4, one on each side of the core particle (CP), using the super command
in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect
overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the
final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of
the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes
the methodology used to build the final proteasome structure. Table S2C presents the
identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate
the distance (a list is provided in Table S2D). The distances were calculated from the
weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example
of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of
surface residues were used as nodes to build the graph. The edges of the graph were placed
between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å
for the proteasome. The weight of each edge was equal to the distance between the node pair.
Surface residues were identified as follows. First, the structure of the protein complex was
represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent
radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were
exported in the "wrl" graphic file format. From this file, the coordinates of each dot were
extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of
any dot of the proteasome structure were considered surface residues (see Fig. S2D for a
representation of the method for the proteasome). In cases where multiple copies of the
proteins were present within the complexes, the mean of the minimal possible distances was
used for the analyses.
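The surface-path distance can be illustrated with a toy graph. The analysis relied on NetworkX; the sketch below substitutes a plain Dijkstra search built on the standard library so that it runs without dependencies. The residue labels and coordinates are invented for the example.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than the cutoff.

    coords maps a residue label to the (x, y, z) position of its C-alpha
    atom; each edge is weighted by the Euclidean distance between nodes.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            distance = math.dist(coords[a], coords[b])
            if distance <= cutoff:
                graph[a].append((b, distance))
                graph[b].append((a, distance))
    return graph


def shortest_path_length(graph, source, target):
    """Length of the weighted shortest path (Dijkstra), or inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue                      # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


if __name__ == "__main__":
    # HYPOTHETICAL surface residues: A and C are 20 units apart in a straight
    # line, but with a cutoff of 15 the path must go through B.
    coords = {"A": (0, 0, 0), "B": (10, 10, 0), "C": (20, 0, 0)}
    graph = build_graph(coords, cutoff=15)
    print(round(shortest_path_length(graph, "A", "C"), 2))
```

Restricting nodes to surface residues means the path follows the outside of the complex, so the resulting distance is generally longer than the straight line between the two C-termini.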
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein
stability, it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, each fused to
the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
that are within the same complexes as the baits, and random proteins used as controls, for a
total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110
and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B,
top left panel), revealing a significant increase in signal-to-noise ratio with longer
linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many
genetic interactions and physical interactions (in vitro and in vivo) have been described
between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some
interactions in living cells (such as between Arc18 and Pup1), often with an increased signal
with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA
with increased linker size reveals new interactions and could be an improved tool to study
inter-complex associations.
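The size of the global screen follows directly from the numbers reported above, as this short check shows. All values are taken from the text; the variable names are illustrative.

```python
# Figures quoted in the text for the global PCA screen.
BAIT_PROTEINS = 15
LINKER_VARIANTS = ("2xL", "3xL", "4xL")
SELECTED_PREYS = 495   # same complex as a bait, or reported interactor
RANDOM_PREYS = 97      # cytoplasmic or nuclear controls

baits = BAIT_PROTEINS * len(LINKER_VARIANTS)
preys = SELECTED_PREYS + RANDOM_PREYS
potential_interactions = baits * preys

# Detected PPIs per bait linker, and the ratio relative to the standard 2xL.
detected = {"2xL": 99, "3xL": 110, "4xL": 126}
ratio_vs_2xl = {linker: n / detected["2xL"] for linker, n in detected.items()}

if __name__ == "__main__":
    print(baits, preys, potential_interactions)
    for linker, ratio in ratio_vs_2xl.items():
        print(f"{linker}: {detected[linker]} PPIs ({ratio:.2f}x the 2xL count)")
```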
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes We used four combinations of linker lengths (2xL-2xL 2xL-4xL 4xL-
2xL 4xL-4xL) for all proteins within a complex As a negative control tests for PPIs between
the RNApol I II and III and COG complex were also performed Among the 10192 unique
tested PPIs 755 interactions were considered as true PPIs (Fig S1B and Table S1C)
representing PPIs among 228 protein pairs (197 unique - reciprocal interactions such as X-
DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] accounting for only one
PPI) after filtration
As expected no interaction was detected between the RNApol and COG proteins Moreover
reciprocal PPI signals ie X-DHFR F[12]-Y-DHFR F[3] versus Y-DHFR F[12]-X-DHFR
F[3] were correlated as previously noted (55) (Fig S1C - 4xL-4xL PPIs) Also for almost
60 of interacting pairs (135228 or 114197 unique) no significant change on the
interaction strength was observed when using the 4xL compared to the 2xL reinforcing the
fact that no overall decrease in specificity is seen with the elongated linkers However the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B)
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs and 74
(65 unique) new PPIs were detected using at least one 4xL Thus doubling the linker length
can substantially widen the repertoire of detected interactions for a complex
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect
new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL
compared with 2xL-2xL). However, the signal was often further improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two
largest components of RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This loss may thus arise from steric effects rather
than from destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion of new interactions (about 30-40%) was detected for the RNApol
complexes than for the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to an individual
polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an
assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center
panel). In the COG complex, new interactions were seen between Cog1, from the core subunit,
and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show that
doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, the PCA signal obtained with longer linkers
allowed better discrimination between the different subunits of large complexes. This is
particularly well illustrated by the proteasome (Fig 1D and 1E, center panels). More PPIs
are detected when the two proteins are in the same subcomplex (such as base-base, core-core
and lid-lid) regardless of the linker length, though the fraction is systematically higher
with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D
and 1E, left and right panels). Structural biology in living cells could thus gain from PPI
data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for
distance, we measured the shortest path between the C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the cumulative
distribution of z-scores as a function of pairwise distance, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the
distance between proteins is significantly longer in cases where a longer linker increases
the signal or leads to the detection of new interactions (Fig 2C). This demonstrates once
again that a longer linker enhances the ability to detect interactions, especially for
proteins that are more distant in space.
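The kind of rank correlation reported between pairwise distance and z-score can be sketched in plain Python as follows. This is a minimal illustration of a Spearman coefficient (Pearson correlation of average ranks), not the authors' analysis code; the function names and the six toy data points are hypothetical, and the sketch does not compute a p-value.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: C-terminal distances and PPI z-scores.
distances = [20.0, 45.0, 60.0, 90.0, 120.0, 150.0]
z_scores = [9.1, 7.4, 8.0, 3.2, 2.9, 1.0]
rho = spearman(distances, z_scores)
print(round(rho, 3))
```

A negative coefficient, as in this toy example, means that z-scores tend to decrease as the distance between the two proteins increases.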
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here we
show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions under these specific conditions with an increased signal-to-noise ratio
and with an enhanced ability to detect distant PPIs, including interactions among complexes
and among subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection
could be used as preys, requiring only the construction of baits with different linker
sizes. PCA is therefore an addition to the other methods available to obtain low-resolution
structural information among subunits of complexes, which include chemical cross-linking of
protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent
biotinylation in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institute of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and
a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes:
new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer
linker combination). (C) Proportions of quantitatively changed interactions and new PPIs
versus unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores
for these newly detected interactions are indicated in the tables. DHFR fragments have also
been modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance
between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative
z-scores for the proteasome PPIs according to the different protein pairwise distances.
(C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini: interactions that are not
affected by longer linkers, those that increase in signal, and those that are newly
detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference | Yeast protein | Complex | Identity (%)
4C2M chain 1 | Rpc10 | RNApol I | 100
4C2M chain 2 | Rpa34 | RNApol I | 92.4
4C2M chain 3 | Rpa49 | RNApol I | 94.4
4C2M chain 4 | Rpa43 | RNApol I | 100
4C2M chain 5 | Rpa190 | RNApol I | 89.7
4C2M chain 6 | Rpc40 | RNApol I | 100
4C2M chain 7 | Rpa135 | RNApol I | 97.2
4C2M chain 8 | Rpb5 | RNApol I | 100
4C2M chain 9 | Rpa14 | RNApol I | 59.6
4C2M chain 10 | Rpa43 | RNApol I | 81.4
4C2M chain 11 | Rpo26 | RNApol I | 100
4C2M chain 12 | Rpa12 | RNApol I | 100
4C2M chain 13 | Rpb8 | RNApol I | 88.2
4C2M chain 14 | Rpc19 | RNApol I | 100
4C2M chain 15 | Rpb10 | RNApol I | 100
4C2M chain 16 | Rpa49 | RNApol I | 100
4C2M chain 17 | Rpc10 | RNApol I | 100
4C2M chain 18 | Rpa43 | RNApol I | 100
4C2M chain 19 | Rpa34 | RNApol I | 92.4
4C2M chain 20 | Rpa135 | RNApol I | 96.2
4C2M chain 21 | Rpa190 | RNApol I | 88.5
4C2M chain 22 | Rpa14 | RNApol I | 55.1
4C2M chain 23 | Rpc40 | RNApol I | 100
4C2M chain 24 | Rpo26 | RNApol I | 100
4C2M chain 25 | Rpb5 | RNApol I | 100
4C2M chain 26 | Rpb8 | RNApol I | 88.2
4C2M chain 27 | Rpa43 | RNApol I | 80.2
4C2M chain 28 | Rpb10 | RNApol I | 100
4C2M chain 29 | Rpa12 | RNApol I | 96
4C2M chain 30 | Rpc19 | RNApol I | 100
4C3I chain A | Rpa190 | RNApol I | 89.2
4C3I chain C | Rpc40 | RNApol I | 99.3
4C3I chain B | Rpa135 | RNApol I | 98.2
4C3I chain E | Rpb5 | RNApol I | 100
4C3I chain D | Rpa14 | RNApol I | 55.1
4C3I chain G | Rpa43 | RNApol I | 78.3
4C3I chain F | Rpo26 | RNApol I | 100
4C3I chain I | Rpa12 | RNApol I | 100
4C3I chain H | Rpb8 | RNApol I | 84.7
4C3I chain K | Rpc19 | RNApol I | 100
4C3I chain J | Rpb10 | RNApol I | 100
4C3I chain M | Rpa49 | RNApol I | 97.2
4C3I chain L | Rpc10 | RNApol I | 100
4C3I chain N | Rpa34 | RNApol I | 88
4V1N chain A | Rpo21 | RNApol II | 97.9
4V1N chain C | Rpb3 | RNApol II | 100
4V1N chain B | Rpb2 | RNApol II | 93.6
4V1N chain E | Rpb5 | RNApol II | 100
4V1N chain D | Rpb4 | RNApol II | 80.8
4V1N chain G | Rpb7 | RNApol II | 100
4V1N chain F | Rpo26 | RNApol II | 100
4V1N chain I | Rpb9 | RNApol II | 100
4V1N chain H | Rpb8 | RNApol II | 91
4V1N chain K | Rpb11 | RNApol II | 100
4V1N chain J | Rpb10 | RNApol II | 100
4V1N chain L | Rpc10 | RNApol II | 100
4V1N chain R | Tfg2 | RNApol II | 60.3
5FJA chain A | Rpo31 | RNApol III | 96.2
5FJA chain C | Rpc40 | RNApol III | 100
5FJA chain B | Ret1 | RNApol III | 100
5FJA chain E | Rpb5 | RNApol III | 100
5FJA chain D | Rpc17 | RNApol III | 73.9
5FJA chain G | Rpc25 | RNApol III | 85.8
5FJA chain F | Rpo26 | RNApol III | 100
5FJA chain I | Rpc11 | RNApol III | 82.7
5FJA chain H | Rpb8 | RNApol III | 94.5
5FJA chain K | Rpc19 | RNApol III | 100
5FJA chain J | Rpb10 | RNApol III | 100
5FJA chain M | Rpc37 | RNApol III | 84.9
5FJA chain L | Rpc10 | RNApol III | 100
5FJA chain O | Rpc82 | RNApol III | 84.3
5FJA chain N | Rpc53 | RNApol III | 73.8
5FJA chain Q | Rpc31 | RNApol III | 100
5FJA chain P | Rpc34 | RNApol III | 57.2
Table S2C. Identity between the proteasome structure and the experimental sequence
Reference | Yeast protein | Complex | Identity (%)
5CZ4-centered chain A | Pre8 | Proteasome | 100
5CZ4-centered chain AA | Pre4 | Proteasome | 100
5CZ4-centered chain B | Pre9 | Proteasome | 100
5CZ4-centered chain BA | Pre3 | Proteasome | 100
5CZ4-centered chain C | Pre6 | Proteasome | 100
5CZ4-centered chain D | Pup2 | Proteasome | 97.1
5CZ4-centered chain E | Pre5 | Proteasome | 100
5CZ4-centered chain F | Pre10 | Proteasome | 100
5CZ4-centered chain G | Scl1 | Proteasome | 100
5CZ4-centered chain H | Pup1 | Proteasome | 100
5CZ4-centered chain I | Pup3 | Proteasome | 100
5CZ4-centered chain J | Pre1 | Proteasome | 100
5CZ4-centered chain K | Pre2 | Proteasome | 100
5CZ4-centered chain L | Pre7 | Proteasome | 100
5CZ4-centered chain M | Pre4 | Proteasome | 100
5CZ4-centered chain N | Pre3 | Proteasome | 100
5CZ4-centered chain O | Pre8 | Proteasome | 100
5CZ4-centered chain P | Pre9 | Proteasome | 100
5CZ4-centered chain Q | Pre6 | Proteasome | 100
5CZ4-centered chain R | Pup2 | Proteasome | 97.1
5CZ4-centered chain S | Pre5 | Proteasome | 100
5CZ4-centered chain T | Pre10 | Proteasome | 100
5CZ4-centered chain U | Scl1 | Proteasome | 100
5CZ4-centered chain V | Pup1 | Proteasome | 100
5CZ4-centered chain W | Pup3 | Proteasome | 100
5CZ4-centered chain X | Pre1 | Proteasome | 100
5CZ4-centered chain Y | Pre2 | Proteasome | 100
5CZ4-centered chain Z | Pre7 | Proteasome | 100
5A5B-centered chain A | Pre3 | Proteasome | 100
5A5B-centered chain AA | Rpn7 | Proteasome | 100
5A5B-centered chain B | Pup1 | Proteasome | 100
5A5B-centered chain BA | Rpn3 | Proteasome | 100
5A5B-centered chain C | Pup3 | Proteasome | 100
5A5B-centered chain CA | Rpn12 | Proteasome | 100
5A5B-centered chain D | Pre1 | Proteasome | 100
5A5B-centered chain DA | Rpn8 | Proteasome | 82.9
5A5B-centered chain E | Pre2 | Proteasome | 99.5
5A5B-centered chain EA | Rpn11 | Proteasome | 89.5
5A5B-centered chain F | Pre7 | Proteasome | 100
5A5B-centered chain FA | Rpn10 | Proteasome | 100
5A5B-centered chain G | Pre4 | Proteasome | 100
5A5B-centered chain GA | Rpn13 | Proteasome | 100
5A5B-centered chain HA | Sem1 | Proteasome | 100
5A5B-centered chain IA | Rpn1 | Proteasome | 85.9
5A5B-centered chain J | Scl1 | Proteasome | 100
5A5B-centered chain K | Pre8 | Proteasome | 100
5A5B-centered chain L | Pre9 | Proteasome | 100
5A5B-centered chain M | Pre6 | Proteasome | 100
5A5B-centered chain N | Pup2 | Proteasome | 100
5A5B-centered chain O | Pre5 | Proteasome | 100
5A5B-centered chain P | Pre10 | Proteasome | 100
5A5B-centered chain Q | Rpt1 | Proteasome | 88
5A5B-centered chain R | Rpt2 | Proteasome | 100
5A5B-centered chain S | Rpt6 | Proteasome | 100
5A5B-centered chain T | Rpt3 | Proteasome | 100
5A5B-centered chain U | Rpt4 | Proteasome | 100
5A5B-centered chain V | Rpt5 | Proteasome | 93.1
5A5B-centered chain W | Rpn2 | Proteasome | 90.9
5A5B-centered chain X | Rpn9 | Proteasome | 100
5A5B-centered chain Y | Rpn5 | Proteasome | 100
5A5B-centered chain Z | Rpn6 | Proteasome | 100
Constructed proteasome chain 1 | Pup1 | Proteasome | 100
Constructed proteasome chain 10 | Pre8 | Proteasome | 100
Constructed proteasome chain 11 | Pre9 | Proteasome | 100
Constructed proteasome chain 12 | Pre6 | Proteasome | 100
Constructed proteasome chain 13 | Pup2 | Proteasome | 100
Constructed proteasome chain 14 | Pre5 | Proteasome | 100
Constructed proteasome chain 15 | Pre10 | Proteasome | 100
Constructed proteasome chain 16 | Rpt1 | Proteasome | 88
Constructed proteasome chain 17 | Rpt2 | Proteasome | 100
Constructed proteasome chain 18 | Rpt6 | Proteasome | 100
Constructed proteasome chain 19 | Rpt3 | Proteasome | 100
Constructed proteasome chain 2 | Pup3 | Proteasome | 100
Constructed proteasome chain 20 | Rpt4 | Proteasome | 100
Constructed proteasome chain 21 | Rpt5 | Proteasome | 93.1
Constructed proteasome chain 22 | Rpn2 | Proteasome | 90.9
Constructed proteasome chain 23 | Rpn9 | Proteasome | 100
Constructed proteasome chain 24 | Rpn5 | Proteasome | 100
Constructed proteasome chain 25 | Rpn6 | Proteasome | 100
Constructed proteasome chain 26 | Rpn7 | Proteasome | 100
Constructed proteasome chain 27 | Rpn3 | Proteasome | 100
Constructed proteasome chain 28 | Rpn12 | Proteasome | 100
Constructed proteasome chain 29 | Rpn8 | Proteasome | 82.9
Constructed proteasome chain 3 | Pre1 | Proteasome | 100
Constructed proteasome chain 30 | Rpn11 | Proteasome | 89.5
Constructed proteasome chain 31 | Rpn10 | Proteasome | 100
Constructed proteasome chain 32 | Rpn13 | Proteasome | 100
Constructed proteasome chain 33 | Sem1 | Proteasome | 100
Constructed proteasome chain 34 | Rpn1 | Proteasome | 85.9
Constructed proteasome chain 35 | Pup1 | Proteasome | 100
Constructed proteasome chain 36 | Pup3 | Proteasome | 100
Constructed proteasome chain 37 | Pre1 | Proteasome | 100
Constructed proteasome chain 38 | Pre2 | Proteasome | 100
Constructed proteasome chain 39 | Pre7 | Proteasome | 100
Constructed proteasome chain 4 | Pre2 | Proteasome | 100
Constructed proteasome chain 40 | Pre4 | Proteasome | 100
Constructed proteasome chain 41 | Pre3 | Proteasome | 100
Constructed proteasome chain 42 | Pre4 | Proteasome | 100
Constructed proteasome chain 45 | Scl1 | Proteasome | 100
Constructed proteasome chain 46 | Pre8 | Proteasome | 100
Constructed proteasome chain 47 | Pre9 | Proteasome | 100
Constructed proteasome chain 48 | Pre6 | Proteasome | 100
Constructed proteasome chain 49 | Pup2 | Proteasome | 100
Constructed proteasome chain 5 | Pre7 | Proteasome | 100
Constructed proteasome chain 50 | Pre5 | Proteasome | 100
Constructed proteasome chain 51 | Pre10 | Proteasome | 100
Constructed proteasome chain 52 | Rpt1 | Proteasome | 88
Constructed proteasome chain 53 | Rpt2 | Proteasome | 100
Constructed proteasome chain 54 | Rpt6 | Proteasome | 100
Constructed proteasome chain 55 | Rpt3 | Proteasome | 100
Constructed proteasome chain 56 | Rpt4 | Proteasome | 100
Constructed proteasome chain 57 | Rpt5 | Proteasome | 93.1
Constructed proteasome chain 58 | Rpn2 | Proteasome | 90.9
Constructed proteasome chain 59 | Rpn9 | Proteasome | 100
Constructed proteasome chain 6 | Pre3 | Proteasome | 100
Constructed proteasome chain 60 | Rpn5 | Proteasome | 100
Constructed proteasome chain 61 | Rpn6 | Proteasome | 100
Constructed proteasome chain 62 | Rpn7 | Proteasome | 100
Constructed proteasome chain 63 | Rpn3 | Proteasome | 100
Constructed proteasome chain 64 | Rpn12 | Proteasome | 100
Constructed proteasome chain 65 | Rpn8 | Proteasome | 82.9
Constructed proteasome chain 66 | Rpn11 | Proteasome | 89.5
Constructed proteasome chain 67 | Rpn10 | Proteasome | 100
Constructed proteasome chain 68 | Rpn13 | Proteasome | 100
Constructed proteasome chain 69 | Sem1 | Proteasome | 100
Constructed proteasome chain 70 | Rpn1 | Proteasome | 85.9
Constructed proteasome chain 9 | Scl1 | Proteasome | 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein | Complex | Reference | Number of missing residues in C-ter
Rpa190 | RNApol I | 4C2M monomer 1 | 0
Rpa14 | RNApol I | 4C2M monomer 1 | 37
Rpa12 | RNApol I | 4C2M monomer 1 | 0
Rpb5 | RNApol I | 4C2M monomer 1 | 0
Rpb10 | RNApol I | 4C2M monomer 1 | 1
Rpa49 | RNApol I | 4C2M monomer 1 | 300
Rpc19 | RNApol I | 4C2M monomer 1 | 0
Rpb8 | RNApol I | 4C2M monomer 1 | 0
Rpa34 | RNApol I | 4C2M monomer 1 | 52
Rpa43 | RNApol I | 4C2M monomer 1 | 10
Rpc40 | RNApol I | 4C2M monomer 1 | 0
Rpc10 | RNApol I | 4C2M monomer 1 | 0
Rpa135 | RNApol I | 4C2M monomer 1 | 0
Rpo26 | RNApol I | 4C2M monomer 1 | 1
Rpa190 | RNApol I | 4C2M monomer 2 | 0
Rpa14 | RNApol I | 4C2M monomer 2 | 37
Rpa12 | RNApol I | 4C2M monomer 2 | 0
Rpb5 | RNApol I | 4C2M monomer 2 | 0
Rpb10 | RNApol I | 4C2M monomer 2 | 1
Rpa49 | RNApol I | 4C2M monomer 2 | 300
Rpc19 | RNApol I | 4C2M monomer 2 | 0
Rpb8 | RNApol I | 4C2M monomer 2 | 0
Rpa34 | RNApol I | 4C2M monomer 2 | 53
Rpa43 | RNApol I | 4C2M monomer 2 | 76
Rpc40 | RNApol I | 4C2M monomer 2 | 0
Rpc10 | RNApol I | 4C2M monomer 2 | 0
Rpa135 | RNApol I | 4C2M monomer 2 | 0
Rpo26 | RNApol I | 4C2M monomer 2 | 1
Rpa190 | RNApol I | 4C3I | 1
Rpa14 | RNApol I | 4C3I | 37
Rpb5 | RNApol I | 4C3I | 0
Rpb10 | RNApol I | 4C3I | 1
Rpa49 | RNApol I | 4C3I | 301
Rpc19 | RNApol I | 4C3I | 0
Rpb8 | RNApol I | 4C3I | 0
Rpa34 | RNApol I | 4C3I | 53
Rpa12 | RNApol I | 4C3I | 0
Rpa43 | RNApol I | 4C3I | 10
Rpc40 | RNApol I | 4C3I | 0
Rpc10 | RNApol I | 4C3I | 0
Rpa135 | RNApol I | 4C3I | 0
Rpo26 | RNApol I | 4C3I | 1
Rpb3 | RNApol II | 4V1N | 50
Rpb11 | RNApol II | 4V1N | 6
Rpb5 | RNApol II | 4V1N | 0
Rpb7 | RNApol II | 4V1N | 0
Rpb10 | RNApol II | 4V1N | 5
Rpo26 | RNApol II | 4V1N | 0
Rpb8 | RNApol II | 4V1N | 0
Rpb4 | RNApol II | 4V1N | 0
Rpb9 | RNApol II | 4V1N | 2
Tfg2 | RNApol II | 4V1N | 173
Rpb2 | RNApol II | 4V1N | 0
Rpc10 | RNApol II | 4V1N | 0
Rpo21 | RNApol II | 4V1N | 278
Rpc11 | RNApol III | 5FJA | 0
Rpc19 | RNApol III | 5FJA | 0
Ret1 | RNApol III | 5FJA | 0
Rpb5 | RNApol III | 5FJA | 0
Rpb10 | RNApol III | 5FJA | 3
Rpc37 | RNApol III | 5FJA | 20
Rpc82 | RNApol III | 5FJA | 0
Rpc31 | RNApol III | 5FJA | 182
Rpb8 | RNApol III | 5FJA | 0
Rpc53 | RNApol III | 5FJA | 0
Rpc25 | RNApol III | 5FJA | 0
Rpc34 | RNApol III | 5FJA | 2
Rpo31 | RNApol III | 5FJA | 0
Rpc40 | RNApol III | 5FJA | 0
Rpc10 | RNApol III | 5FJA | 0
Rpc17 | RNApol III | 5FJA | 0
Rpo26 | RNApol III | 5FJA | 2
Rpn6 | Proteasome | 5CZ4 and 5A5B | 3
Rpn5 | Proteasome | 5CZ4 and 5A5B | 3
Rpn3 | Proteasome | 5CZ4 and 5A5B | 45
Rpn2 | Proteasome | 5CZ4 and 5A5B | 20
Rpn1 | Proteasome | 5CZ4 and 5A5B | 0
Rpn9 | Proteasome | 5CZ4 and 5A5B | 6
Rpn8 | Proteasome | 5CZ4 and 5A5B | 30
Pre10 | Proteasome | 5CZ4 and 5A5B | 39
Pre6 | Proteasome | 5CZ4 and 5A5B | 10
Pre7 | Proteasome | 5CZ4 and 5A5B | 0
Rpt3 | Proteasome | 5CZ4 and 5A5B | 0
Rpt2 | Proteasome | 5CZ4 and 5A5B | 1
Pre2 | Proteasome | 5CZ4 and 5A5B | 0
Rpt4 | Proteasome | 5CZ4 and 5A5B | 10
Pre1 | Proteasome | 5CZ4 and 5A5B | 3
Pre8 | Proteasome | 5CZ4 and 5A5B | 0
Pre9 | Proteasome | 5CZ4 and 5A5B | 12
Pup2 | Proteasome | 5CZ4 and 5A5B | 9
Pup3 | Proteasome | 5CZ4 and 5A5B | 0
Pup1 | Proteasome | 5CZ4 and 5A5B | 6
Rpn13 | Proteasome | 5CZ4 and 5A5B | 23
Rpn12 | Proteasome | 5CZ4 and 5A5B | 2
Rpn11 | Proteasome | 5CZ4 and 5A5B | 8
Rpn10 | Proteasome | 5CZ4 and 5A5B | 71
Sem1 | Proteasome | 5CZ4 and 5A5B | 0
Scl1 | Proteasome | 5CZ4 and 5A5B | 0
Rpt1 | Proteasome | 5CZ4 and 5A5B | 11
Pre4 | Proteasome | 5CZ4 and 5A5B | 4
Pre5 | Proteasome | 5CZ4 and 5A5B | 0
Rpt5 | Proteasome | 5CZ4 and 5A5B | 0
Pre3 | Proteasome | 5CZ4 and 5A5B | 0
Rpt6 | Proteasome | 5CZ4 and 5A5B | 9
Rpn7 | Proteasome | 5CZ4 and 5A5B | 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair
protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI
signals (colony size) obtained in the global PCA (top left) and in the intra-complex
(proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right)
experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond
to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for
PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation
coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and
r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of
linker lengths according to the distance between the interacting proteins. The red line
represents the density of distances for all interactions. The distribution for detected
interactions is shifted to the left because proteins are closer to each other when
interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right
owing to the ability of the 4xL to detect interactions between proteins that are further
apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global
PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a
spot-dilution assay of selected results from the intra-complex experiment. Examples of each
category of change are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
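One way a distance-weighted shortest path between two C-termini could be computed is sketched below in Python. This is an illustration under stated assumptions, not the procedure used to produce Table S2A: it assumes that surface residues are reduced to 3D points, that two points are connected when they lie within a hypothetical `cutoff` distance, and that Dijkstra's algorithm is run on the resulting graph. The function name, the cutoff value and the coordinates are all invented for the example.

```python
import heapq
import math


def shortest_path_length(coords, start, end, cutoff):
    """Dijkstra over labelled 3D points.

    Two points are joined by an edge when they are at most `cutoff` apart,
    and the edge weight is their Euclidean distance. Returns math.inf when
    `end` cannot be reached from `start`.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        if u == end:
            return d
        for v, xyz in coords.items():
            if v == u:
                continue
            w = math.dist(coords[u], xyz)
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf


# Hypothetical coordinates: two C-termini and two intermediate surface points.
coords = {
    "Cter_A": (0.0, 0.0, 0.0),
    "s1": (3.0, 4.0, 0.0),
    "s2": (6.0, 8.0, 0.0),
    "Cter_B": (6.0, 8.0, 5.0),
}
path = shortest_path_length(coords, "Cter_A", "Cter_B", cutoff=6.0)
straight = math.dist(coords["Cter_A"], coords["Cter_B"])
print(path, round(straight, 2))
```

In this toy case the path that follows the intermediate points is longer than the straight-line distance, which is the reason a surface path can be a more realistic proxy than a direct line through the complex.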
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close in
space without these necessarily being physical interactions. Such a method would thus make
it possible to examine and dissect the architecture of protein complexes in greater depth.
Concretely, the approach was to modify the length of the linkers used in DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
method on protein complexes using several combinations of linkers of different lengths.
Finally, the validity of the method could be further confirmed by comparing the results
with distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is maintained despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all the unique interactions
detected, more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of PCA were applied to protein complexes that have already
been studied. This percentage could vary with the number of combinations of linkers of
different lengths that are used. For example, it could be lower if only a single combination
were used, since some protein-protein associations were detectable only with one specific
combination of linkers. Using an elongated linker on the DHFR F[1,2] fragment seems to be
sufficient to detect most of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. However, these
cases can still provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of elongated linkers is informative about
the organization of protein complexes, particularly when it involves central proteins.
Finally, the associations detected closely reflect the organization of protein complexes
into subcomplexes. Comparing the distances between proteins in the proteasome structures
with the PCA results confirms that increasing the linker length does make it possible to
detect associations between proteins that are further apart in space.
The modification made to DHFR PCA represents a notable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to linkers of
other lengths, which would provide a validation and a point of comparison and would help
detect problems that arose during the construction of the protein fusions. For example, it
becomes easier to spot a faulty recombination event or the appearance of mutations:
interactions would be observed for the correctly built protein, whereas the problematic one
would show none. Admittedly, adding this control makes the experiments and analyses more
complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid
method that remains relatively simple. It requires no special infrastructure, yet can also
be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method
that retains the endogenous promoter for protein expression. The fragments do not tend to
associate spontaneously unless they are very close together, which reduces false positives.
DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study
PPIs under several growth conditions or in the presence of cellular perturbations. It can
also be followed in real time, which gives access to the dynamics of interactions (56).
These features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths that would provide several resolutions of the PPI network. One
would first have to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. One would also have to
determine the optimal increment for maximizing new information, taking into account the
complexity added with each additional linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the detected associations, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into consideration. For small protein complexes, a
finer resolution, and therefore shorter linkers, might be sufficient, whereas the
resolution would have to be coarser for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture
challenging. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress, and
they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us
a general picture of their architecture. At the scale of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Acknowledgements
Completing this project required the help of several people whom I sincerely wish to thank.
First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey,
Christian encouraged me to give the best of myself, both scientifically and as a member of
the team. He not only gave me the material means to do so, but also showed me that I had the
abilities to do so. Christian is a supervisor who is very present and available for his
students. He offered me opportunities and supported me in each of them.
I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr.
Nicolas Bisson, for their advice and the time they devoted to me during this project.
I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's
two research professionals. Their great expertise and their passion for science are a pillar
of this team. Without their invaluable advice, dedication and availability, carrying out this
project would have been particularly arduous. I also wish to thank my collaborators Xavier
Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better.
Special thanks to Xavier for his help, his availability and the lively discussions.
I believe it is important to thank all the members of the Landry laboratory. Graduate studies
require spending a great deal of time in the laboratory, which becomes like a second home.
Hence the importance of sharing laughs and building a rapport with its members. I would like
to thank them all for the chatting and joking at the famous "tea breaks", the lively
discussions and, of course, the support both in the laboratory and morally. Thank you to
Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to
Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A
very special thank-you to Guillaume and Hélène, who were especially good at putting a smile
on my face or supporting and advising me in difficult times.
It is also important to thank my parents, as well as all my family and my friends. My parents
have always encouraged me to fulfil myself and to love my work. They not only provided an
ideal setting for reaching my goals throughout my studies, but also offered me their moral
support and instilled in me the importance of always doing one's best. The values they passed
on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my
family and friends, I was able to unwind, simply have fun and pour my heart out from time to
time. They were a source of moral support.
Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an
incredibly generous person, generous with his time, his attention, his knowledge and his
passions. He was an invaluable support throughout this journey, at every moment. His
encouragement, his shoulder, his tissues and his understanding soothed my fears and my
sorrows. He was also there to celebrate the successes. I have no words to describe how much
this person has given me personally, humanly and professionally. Marc has made me a better
person, and I will always be grateful to him. Thank you, my love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that will
be submitted for publication. The article presents the adaptation of the PCA method to detect
associations between proteins that are distant in space, and its application to the study of
protein complexes. I contributed to planning the experiments with Christian R. Landry
(project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research
professionals). Several people, including myself, took part in carrying out these
experiments, namely Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student),
Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses
were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The
analysis of the results and the writing of the article were done jointly by Isabelle
Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to the writing of a literature review published in
Briefings in Functional Genomics in March 2016 under the title Multi-scale perturbations of
protein interactomes reveal their mechanisms of regulation, robustness and insights into
genotype-phenotype maps. Several people took part in the writing: Marie Filteau (postdoctoral
fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume
Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry.
That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, through their great diversity of roles, are considered the machinery of life. Their
associations, whether temporary or permanent, are at the heart of signalling and regulatory
pathways as well as protein complexes. Proteins can interact with one another through
intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces
and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper
functioning of the cell, since they are involved in all cellular processes and in the
maintenance of cellular functions.
Interactions that form transiently are often found in signalling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination can
lead to diseases such as cancer (1). One example of a transient association is that between
the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The
activity of this enzyme is regulated by the association and dissociation of the catalytic and
regulatory subunits. In yeast and mammals, the transition from one form to the other controls
several processes, including energy metabolism, cell growth, ageing and the response to
stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's
syndrome (8).
In addition to transient interactions, the cell is home to stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs within a
complex are stable, the complex itself may form only in a particular context. A protein
complex can be defined as an association between two or more proteins (9). The association of
these proteins gives rise to additional biological activities that would be impossible for
the individual proteins alone. An example that illustrates this concept very well is the
proteasome, a protein complex involved in protein homeostasis through the degradation of
obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes,
consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory
subcomplexes. It comprises 33 proteins, some present in more than one copy (10-13). Given its
importance in protein recycling, the proteasome is an attractive target for fighting cancer
and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network has
been mapped at large scale for several organisms, notably humans (17), Saccharomyces
cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several
bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the
network and do not fully take into account the cell's capacity to adapt to different
conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were
subsequently produced that consider the dynamics of interaction networks, by perturbing cell
growth conditions. They provide information on, among other things, the adaptation or
plasticity of an organism in the presence of a stress or a new environment. Despite this new
perspective, it remains difficult to distinguish a stable interaction from a transient one
using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary
history of protein complexes can be traced by comparing PPIs, as shown by the study of the
nuclear pore of yeast and of the trypanosome (30). These two organisms, which diverged more
than 1.5 billion years ago, show similarities and differences in the structure of their
nuclear pores. This essential protein complex forms a channel in the membrane of the cell
nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado
and colleagues thus identified the ancestral part of the nuclear pore and the part that
subsequently diverged. The structural differences explain the distinct mRNA export mechanisms
of the two organisms (30). Furthermore, perturbing PPIs helps elucidate the robustness of a
protein complex to mutations, that is, the ability of the complex to function despite the
perturbation. Diss and colleagues systematically deleted the genes encoding the proteins
found in the nuclear pore and the retromer (31). The retromer is a non-essential protein
complex whose function is the recycling of membrane receptors. By analysing the interactions
present in these complexes after each perturbation, the authors observed that the nuclear
pore remained functional despite the loss of certain proteins, whereas the retromer
dissociated completely after the loss of a single protein. They were thus able to identify
the proteins essential for the assembly of these complexes and to demonstrate the importance
of paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34).
In addition, identifying structural differences in a protein complex between two organisms
can provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which will eventually
make it possible to treat the infections caused by these parasites (35). PPIs also help to
understand the genetic basis of diseases, as shown by Sahni and colleagues. This team
examined nearly 3,000 mutations found across a spectrum of Mendelian diseases. In nearly 60%
of cases, the perturbation of interaction networks was responsible for the diseases under
study, affecting the networks either partially or completely. Moreover, different mutations
in the same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary because each has advantages and limitations that
restrict it to a different subset of the interaction network (37). Nevertheless, all of these
methods can be divided into two main categories: methods that determine the composition of
protein complexes, and methods that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second category
comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast
two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the
second category rely on a very similar principle: the reconstitution of a functional reporter
that emits a signal when the two proteins interact physically. The second category also
includes three hybrid methods: energy transfer between fluorescent molecules (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context,
the term "hybrid method" refers to methods that detect associations between proteins that are
close together in space without these associations necessarily being physical interactions.
These methods therefore combine the characteristics of both categories. For the purposes of
this project, they are considered part of the second category because they provide information
on the spatial relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships these components
maintain with one another.
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
Purifying protein complexes and identifying their components by MS aims to isolate a protein
complex and identify its members. Several techniques are used to purify protein complexes,
including affinity chromatography. Affinity chromatography separates a protein of interest and
its interactors from a protein extract by means of an epitope specific to that protein. This
epitope is recognized by an antibody bound to the purification column. Several rounds of
purification can be performed to reduce the non-specific interactions that cause background
noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes
these peptides and separates them according to their mass-to-charge ratio, producing a mass
spectrum. Comparing the resulting profiles with those in a database identifies the proteins
found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a
first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the
resulting fragments. This additional spectrum provides more information about the peptide
(41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which
separation is based on the size of the protein complexes. The main advantage of this
purification is that it can isolate all the protein complexes of an organism for study (43).
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
linked to the two proteins of interest through a linker. When the two proteins of interest
interact physically, the two reporter fragments assemble, reconstituting a functional reporter
that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was
Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that
enabled the development of several others. However, the technique has some limitations. First,
in classical Y2H the proteins studied must be soluble. Variations of the method have
nevertheless been developed to allow the study of membrane proteins (45-47); this approach is
the subject of the next paragraph. Second, because the reporter is a transcription factor, the
interactions tested must take place in the nucleus, which may alter the endogenous
localization of the proteins. The technique also has low sensitivity, suffers from background
noise and is not quantitative. It often requires overexpression of the proteins, which can
generate false positives. It is therefore impossible to relate the abundance of a protein to
the strength or abundance of an interaction between proteins (48-50). Despite these
constraints, it is still widely used because it makes it possible to study the PPIs of another
species, such as humans, in a simpler model (51).
In MYTH, the two reporter fragments are derived from a mutated ubiquitin to which a
transcription factor is attached. In the presence of a physical interaction between the
proteins of interest, the transcription factor attached to the reconstituted ubiquitin is
released, activating the transcription of a reporter gene. Methods based on split-ubiquitin
have enabled major advances in the study of proteins that are membrane-bound, insoluble or
located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial
background noise and the inability to quantify results (47-50, 52, 53).
PCA is similar to the two methods described above but, rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The choice
of the reporter and of the split site were decisive elements in the design of the method.
Moreover, because the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to associate spontaneously unless they are very
close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a
mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to
methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved in
particular in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed
signal is cell density, that is, the number of cells that were able to grow on the selection
medium. The technique has the advantage of being quantitative and of preserving the native
promoter of the proteins studied (48, 55, 56). In addition, results obtained by PCA suggest
that the cellular localization of proteins is preserved. Indeed, there is a gene ontology
enrichment among proteins known to share the same cellular localization (55). However, a
change in localization cannot be ruled out, because the reporter fragments are added at the
C-terminus, which could interfere with the localization signal sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. Adding a linker
often reduces these risks by moving the reporter fragment away from the protein to which it is
attached, which reduces interference between the two. The composition or length of the linker
may need to be optimized. There are three categories of linkers: flexible linkers, rigid
linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility
between the protein of interest and the reporter fragment is desirable. Rigid linkers provide
better separation between the protein of interest and the reporter fragment and ensure that
the functions of each element are maintained. They are especially useful when a flexible
linker is insufficient to separate the two elements properly or interferes with the activity
of the protein. In vivo cleavable linkers allow the reporter fragment to be released under
certain conditions. They are particularly useful for allowing each element to carry out its
own biological activity. It is therefore essential to choose the linker and its parameters
carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS, and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on the transfer of energy between two fluorescent proteins in close proximity. The
two fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent
protein when the two proteins are close to each other. The interaction is detected by
microscopy or cytometry through the emission of the acceptor fluorescent protein. This method
is particularly useful for following an interaction over time. However, substantial background
noise and the partial overlap of the fluorescence of the two proteins can hinder the
interpretation of results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS techniques,
except that, before purification, the proteins are attached to one another by covalent bonds.
These bonds resist enzymatic digestion, thereby providing structural information on how the
proteins associate within the protein complex. Nevertheless, cross-linking complicates data
analysis and can lead to an incorrect model of the architecture of the protein complex. The
method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to
the protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions as
well as interactions between neighboring proteins. However, the biotin ligase is larger than
green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This
large size can impair the activity of the protein of interest or the formation of
interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They report on the proximity of proteins, giving access to a new
scale of molecular resolution that is otherwise difficult to reach. In addition to being
complex, the existing techniques require specialized infrastructure (equipment and databases)
and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid
methods would help to better define the architecture of protein complexes and their
subcomplexes at low molecular resolution. Such methods would complement the two categories of
methods. They would also compensate for the shortcomings of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to
many protein complexes and require an approach specific to each complex.
1.5 The linker, a potentially useful parameter for modulating the detection of protein-protein interactions
Because of its relative simplicity and the linker that connects the reporter fragments to the
proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is
a short, soluble and flexible peptide segment composed of two repeats of the following motif:
four glycines and one serine (GGGGS). It provides good flexibility and good association of the
reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino
acids, the former nonpolar and the latter polar and uncharged. The linker connects the
reporter fragment to the C-terminus of the proteins under study.
The length of the linker also constrains the ability to detect an interaction, as was observed
by the research team that developed large-scale PCA (55). While studying RNA polymerase
(RNApol) II and several other protein complexes, the authors noted that an interaction was
3.5 times more likely to be detected when the C-termini of the proteins of interest were less
than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to
end. In addition, an earlier study had shown that increasing the length of the linker made it
possible to determine the conformation of a dimeric receptor (69). It is thus possible to
detect new interactions and, in doing so, to obtain new structural information.
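The relationship between linker length and detectable distance can be illustrated with a
back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that the
maximal span grows linearly with the number of residues, and it calibrates the per-residue
length on the 82 Å reported for two ten-residue linkers placed end to end (55). The function
name and the linear model are assumptions and are not taken from the cited study.

```python
# Rough estimate of the maximal distance spanned by two PCA linkers placed end to end.
# Assumption (not from the cited study): the span scales linearly with residue count.

MOTIF = "GGGGS"                  # linker motif described in the text
REFERENCE_SPAN_ANGSTROM = 82.0   # reported span of two 2x linkers end to end (55)
REFERENCE_RESIDUES = 2 * 2 * len(MOTIF)  # two linkers, two repeats each = 20 residues
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES  # derived, about 4.1


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Estimated end-to-end span (in Å) for two linkers with the given repeat counts."""
    residues = (repeats_a + repeats_b) * len(MOTIF)
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for a, b in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{a}xL + {b}xL: about {max_span(a, b):.1f} Å")
```

Under this simple model, each additional GGGGS repeat on one partner adds roughly 20 Å of
reach. Flexible linkers rarely adopt a fully extended conformation, so these figures are best
read as upper bounds.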
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the length of the DHFR PCA linker would make it
possible to detect interactions increasingly distant in space, thereby modulating the scale of
molecular resolution. This adaptation would yield a new hybrid method that could help define
protein-protein associations between protein complexes and subcomplexes. The first objective
was to assess the general impact of different linker lengths on the ability to detect
protein-protein associations. To this end, protein-protein associations between 15 proteins
found in seven protein complexes were tested against the proteins found in these complexes and
their known interactors. The second objective was to assess the impact of increasing linker
length on our understanding of the architecture of protein complexes and their subcomplexes.
Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol
I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out
with different combinations of linker lengths. The final objective was to determine whether
increasing linker length made it possible to detect associations between proteins that are
farther apart in space. To do so, distances were calculated between the proteins contained in
the proteasome structures and compared with the experimental results.
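A distance calculation of this kind can be sketched as follows. The example assumes that each
protein is represented by a single point, such as the Cα atom of its C-terminal residue, and
that a pair is considered within reach when the distance is below a chosen cutoff. The subunit
names and coordinates are invented for illustration and do not come from any proteasome
structure; the 82 Å cutoff is the value reported in (55).

```python
import math
from itertools import combinations

# Hypothetical coordinates (in Å) of the C-terminal C-alpha atom of each subunit.
# Real values would be read from a structure file; these are placeholders.
C_TERMINI = {
    "SubunitA": (10.0, 10.0, 10.0),
    "SubunitB": (40.0, 50.0, 10.0),
    "SubunitC": (130.0, 10.0, 10.0),
}


def pairwise_distances(coords):
    """Return {(name1, name2): Euclidean distance in Å} for every pair of subunits."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def within_reach(coords, cutoff):
    """List the pairs whose C-termini are closer than the cutoff (in Å)."""
    return [pair for pair, d in pairwise_distances(coords).items() if d < cutoff]


if __name__ == "__main__":
    for pair, d in pairwise_distances(C_TERMINI).items():
        print(pair, f"{d:.1f} Å")
    print("Within 82 Å:", within_reach(C_TERMINI, 82.0))
```

Comparing the list of pairs within reach at several cutoffs with the pairs detected
experimentally for each linker length is one way to relate the structural distances to the PCA
signal.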
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive for several reasons, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of protein
complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in
various fields, such as the determination of protein function, regulatory networks, gene
expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system functions requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangements. We examined the
potential of the dihydrofolate reductase protein-fragment complementation assay (DHFR PCA) in
yeast to obtain low-resolution structural constraints on protein complexes. We showed that
using extended peptide linkers between the fusion proteins and the DHFR fragments improves the
detection of protein-protein interactions and reveals interactions that are more distant in
space. Extended linkers thus provide an improved tool for detecting and measuring
protein-protein interactions and protein proximity in vivo. We used this tool to further
investigate the architecture of the RNA polymerases, the proteasome and the conserved
oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions and
protein proximity in living cells. We use this tool to further investigate the architecture of
the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in
yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially
complementary because, on the one hand, they tell us which proteins assemble into complexes in
the cell and, on the other hand, how proteins may be physically located relative to one
another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins
are usually connected to the reporter fragments with a linker of ten amino acids. In
principle, the length of the linker limits the maximum distance between the proteins for an
interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast
showed that the distance constraint determined by linker length could affect the ability to
detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter allows the detection of configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes
could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly,
distances between proteins in living cells. Here, we test the effect of linker size on the
ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
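As a practical aid, percentages in recipes of this kind are weight per volume, so they convert
directly into grams for a given batch. The sketch below shows the arithmetic for solid YPD and
for an antibiotic added from a concentrated stock. The 500 mL batch size and the 100 mg/mL
stock concentration are hypothetical values chosen for the example.

```python
# Convert percentage (w/v) recipes into grams for a given batch volume.
# 1% (w/v) corresponds to 1 g per 100 mL.

YPD_SOLID_PERCENT = {  # percentages as listed for solid YPD medium
    "yeast extract": 1.0,
    "tryptone": 2.0,
    "glucose": 2.0,
    "agar": 2.0,
}


def grams_needed(recipe_percent, volume_ml):
    """Mass of each component (g) for a batch of volume_ml millilitres."""
    return {name: pct / 100.0 * volume_ml for name, pct in recipe_percent.items()}


def stock_volume_ul(final_ug_per_ml, volume_ml, stock_mg_per_ml):
    """Volume of stock (µL) needed to reach the final concentration.

    1 mg/mL equals 1 µg/µL, so total µg divided by stock µg/µL gives µL.
    """
    return final_ug_per_ml * volume_ml / stock_mg_per_ml


if __name__ == "__main__":
    batch_ml = 500  # hypothetical batch size
    for name, grams in grams_needed(YPD_SOLID_PERCENT, batch_ml).items():
        print(f"{name}: {grams:.1f} g")
    # Hypothetical 100 mg/mL clonNAT stock, final concentration 100 µg/mL.
    print(f"clonNAT stock: {stock_volume_ul(100, batch_ml, 100):.0f} µL")
```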
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
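The idea of encoding the same peptide motif with different codons can be illustrated with a
short check. The codon choices below are hypothetical, since the actual nucleotide sequences
are not given in this section; the sketch only shows that two distinct DNA sequences can
translate to the same GGGGS repeat. One plausible motivation, not stated in the text, is to
avoid long identical DNA repeats within the construct.

```python
# Two different DNA encodings of the GGGGS motif built from synonymous codons.
# Codon choices are illustrative and are not the sequences used in the plasmids.

CODON_TABLE = {  # subset of the standard genetic code: glycine and serine only
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}

REPEAT_1 = "".join(["GGT", "GGC", "GGA", "GGG", "TCT"])  # hypothetical encoding
REPEAT_2 = "".join(["GGA", "GGT", "GGG", "GGC", "AGC"])  # synonymous alternative


def translate(dna):
    """Translate a Gly/Ser-only coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def identical_positions(seq_a, seq_b):
    """Count positions at which two equal-length sequences carry the same base."""
    return sum(a == b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    print(translate(REPEAT_1), translate(REPEAT_2))
    print("identical bases:", identical_positions(REPEAT_1, REPEAT_2), "of", len(REPEAT_1))
```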
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL,
3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted.
The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
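The insert-to-vector ratio used in Gibson assembly is a molar ratio, so the mass of insert to
add depends on the relative lengths of the two DNA molecules. The sketch below applies the
usual proportionality between mass and length for double-stranded DNA. The fragment lengths
and the amount of vector are hypothetical, because they are not given in the text.

```python
# Mass of insert required for a given insert:vector molar ratio.
# For double-stranded DNA, moles are proportional to mass divided by length.


def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving an insert:vector molar ratio of molar_ratio:1."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Hypothetical example: 50 ng of a 4000 bp vector and a 400 bp insert at 5:1.
    print(f"{insert_mass_ng(50, 4000, 400, 5):.1f} ng of insert")
```

For an assembly with two inserts, such as the 4:4:1 reaction used for the F[3] plasmids, the
same function would be applied to each insert separately with a ratio of 4.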
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm,
with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusion in all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies directed against them. Exponentially growing cells (20 OD600) were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10,000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belonged to the same complexes as the baits or
interacted with one of them according to data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was
also included in the set of preys as controls. Each prey was present in four replicates, two
on each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the consensus
protein complexes published by (84) to choose 95 central and associated proteins that are
members of the following complexes: RNApol I, II and III, the proteasome and the COG
complex. These complexes were selected because they vary in size (RNApol I (n=14), II
(n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested);
and COG complex (n=8)) and because interactions among their members have been shown to be
detectable, at least partially, by DHFR PCA. In addition, published structures are available
for the RNApol and proteasome complexes, making it possible to compare our results with
known protein complex organization. We successfully constructed 80.0% and 76.6% of the
strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively,
and 100% for the COG complex. In total, 286 strains harboring proteins fused to
2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins
(85 out of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one
mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates
of MATa cells were generated, including all strains mentioned above. Baits and preys were
positioned so that, in a block of four strains, all combinations of linker sizes could be
tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of
bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in
16 replicates for the proteasome complex. The blocks were randomly positioned on the colony
arrays. Finally, each 1536-array was designed to contain a double border of a strain showing
a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the
growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated
at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin)
replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics)
and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying
strains onto specific positions in a 96-format with a re-arraying tool. Colonies were
further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin
and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were
generated and replicated a few times to have enough cells to perform crosses with all of
the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates
with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid
selection were performed on YPD+clonNAT+HygB, with an incubation time of two days at 30°C
per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for
four days, after which a second round of MTX selection was performed. Plates were incubated
at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each
day from the second round of diploid selection to the end of the experiment.
For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the
signal increased, did not change or decreased with longer linkers. The same procedure as
described above was used to assess the growth on MTX medium of selected diploid cells
resulting from a new cross between bait and prey strains. The correlation between the
results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we
confirmed results for 10 pairs of interacting proteins by measuring cell growth in a
spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL
DHFR fragment fusions to proteins of interest were adjusted to 1 OD600/mL in water.
Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and
DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged
with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and applied a manual threshold across
plates. We selected colonies for measurement with a circular selection, using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those with an area below 20 pixels and a circularity
below 0.5, and kept the particle closest to the center. We considered the particle a colony
if its center of mass was within the mid-distance between two colonies. All plate images
were also examined. The average of the background pixels was subtracted from the colony
intensity.
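The smoothing step can be illustrated with a minimal sketch of a two-way (Tukey) median polish. The custom ImageJ script itself is not reproduced here. The function names below are hypothetical, and reading "averaged with the raw image" as the mean of the raw matrix and the polished fit (overall + row + column effects) is an assumption.

```python
from statistics import median

def median_polish(matrix, n_iter=10):
    """Decompose matrix[i][j] into overall + row_eff[i] + col_eff[j] + resid[i][j]."""
    rows, cols = len(matrix), len(matrix[0])
    resid = [[float(v) for v in row] for row in matrix]
    overall, row_eff, col_eff = 0.0, [0.0] * rows, [0.0] * cols
    for _ in range(n_iter):
        # Sweep the median of each row out of the residuals.
        for i in range(rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the median of each column out of the residuals.
        for j in range(cols):
            m = median(resid[i][j] for i in range(rows))
            col_eff[j] += m
            for i in range(rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid

def smooth_with_raw(matrix):
    """Average the raw matrix with its median-polish fit (assumed interpretation)."""
    overall, row_eff, col_eff, _ = median_polish(matrix)
    return [
        [0.5 * (matrix[i][j] + overall + row_eff[i] + col_eff[j])
         for j in range(len(matrix[0]))]
        for i in range(len(matrix))
    ]
```

In this sketch an isolated bright pixel ends up almost entirely in the residual, so averaging the fit with the raw image halves its excess intensity while leaving smooth row and column gradients untouched.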
Colony intensity values from day 4 of growth on the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
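As a minimal sketch of this transformation, assuming colony measurements are stored in dictionaries keyed by array position (the data layout and the function name are hypothetical):

```python
import math

def transform_intensities(mtx_intensity, diploid_size, min_diploid_size=16):
    """Return log2(intensity + 1) for colonies that grew on the diploid plate.

    mtx_intensity: {position: colony intensity on the second MTX selection}
    diploid_size:  {position: colony size on the diploid selection plate}
    Colonies with a diploid size below min_diploid_size are dropped.
    """
    return {
        pos: math.log2(value + 1)
        for pos, value in mtx_intensity.items()
        if diploid_size.get(pos, 0) >= min_diploid_size
    }
```

Adding 1 before the logarithm maps a null intensity to 0 instead of an undefined value.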
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation
(sdb) of the background distribution were used to convert each interaction score into a
z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction
across linker size combinations. We considered a change significant when Zs differed by
more than 2.
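The scoring scheme can be sketched as follows. This is not the mixtools implementation used for the analysis: it is a small standard-library EM fit of a two-component normal mixture, written only to illustrate the logic. The function names, the initialization at the extreme scores, and the choice of the component with the lower mean as the background are assumptions.

```python
import math

def norm_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_normals(scores, n_iter=100):
    """EM for a 1D mixture of two normals; returns [(mean, sd, weight)] sorted by mean."""
    xs = list(scores)
    n = len(xs)
    mu = [min(xs), max(xs)]                  # start one component at each extreme
    sd0 = (max(xs) - min(xs)) / 4 or 1.0
    sd, w = [sd0, sd0], [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[k] * norm_pdf(x, mu[k], sd[k]) for k in (0, 1)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weight, mean and standard deviation of each component.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / n
    return sorted(zip(mu, sd, w))

def z_scores(interaction_scores, threshold=2.5):
    """Convert {pair: Is} into {pair: Zs} and the set of pairs with Zs > threshold."""
    (b, sdb, _), _signal = fit_two_normals(interaction_scores.values())
    zs = {pair: (score - b) / sdb for pair, score in interaction_scores.items()}
    detected = {pair for pair, z in zs.items() if z > threshold}
    return zs, detected
```

Under these assumptions, two linker combinations for the same pair can then be compared by subtracting their Zs and checking whether the difference exceeds 2.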
For the intra-complexes experiment, extreme outliers on the MTX selection plates that were
farther from the median than Q1 − 3(Q3 − Q1) or Q3 + 3(Q3 − Q1) were excluded (Q1 and Q3
represent the first and third quartiles). Colonies corresponding to the control interaction
and positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering
steps, interactions with at least four replicates for every linker combination were retained
and the median colony size was used as the Is. Significant interactions were identified as
described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and
standard deviation (sdb) of the background distribution were calculated for each linker
combination and each complex separately. For the COG complex, because the number of pairwise
interactions is limited to 64, all the results were combined to calculate these parameters.
An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein
pairs presenting detected interactions with at least one linker combination, some pairs were
filtered out, mainly because they did not pass all of the thresholds or because the fusion
strains (Taf14 and Spt5 fused to DHFR F[3]) gave incoherent results for all tested
interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
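A minimal sketch of the outlier filter and the replicate threshold is given below. The text does not state over which set of colonies the quartiles were computed, nor which quartile estimator was used. The sketch applies the rule to whatever list it receives and uses the "inclusive" method of the Python standard library, both of which are assumptions, as are the function names.

```python
from statistics import median, quantiles

def remove_extreme_outliers(sizes, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in sizes if low <= s <= high]

def interaction_score(sizes, min_replicates=4):
    """Median colony size after filtering, or None if too few replicates remain."""
    kept = remove_extreme_outliers(sizes)
    return median(kept) if len(kept) >= min_replicates else None
```

With a multiplier of 3 rather than the more common 1.5, only colonies far outside the bulk of the distribution are discarded.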
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer
linker combination) were separated from the others and classified as new interactions (Table
S1C). For the remaining pairs, because baits and preys were positioned so that all
combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL) could be tested for a
specific interaction in a block of four adjacent strains, the Is for the different linker
size combinations could be compared directly. The difference with the reference 2xL-2xL
interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A
paired t-test was used to identify significant differences in colony size (with
FDR-corrected p-values). These pairs of interacting proteins were separated into two
additional categories: unchanged interactions, where the interaction was detected with the
reference linker size (2xL-2xL) and also with the longer linker combinations but without any
significant change (t-test FDR p-value above 0.05), and quantitative changes, where the
interaction was detected with the reference linker size (2xL-2xL) and changed significantly
for at least one longer linker combination (difference greater than 1 or smaller than -1,
with t-test FDR p-value < 0.05) (Table S1C).
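The classification rules can be sketched as follows. The text does not name the FDR procedure; the Benjamini-Hochberg adjustment is assumed here. The paired t-test p-values are taken as inputs computed elsewhere, and the function names are hypothetical. Pairs with a significant p-value but an absolute difference of 1 or less are not covered explicitly by the text and are treated as unchanged in this sketch.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_longer, differences, fdr_pvalues,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref:    True if detected with the 2xL-2xL reference
    detected_longer: True if detected with at least one longer combination
    differences:     Is differences versus 2xL-2xL, one per longer combination
    fdr_pvalues:     matching FDR-corrected paired t-test p-values
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, q in zip(differences, fdr_pvalues):
        if abs(diff) > min_diff and q < alpha:
            return "quantitative change"
    return "unchanged"
```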
Analysis of protein distances within complexes
Yeast protein sequences of RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex
in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a
full core. A complete proteasome structure was built by superposing two 5A5B structures on
the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual
inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the
central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome
structure was composed of 5A5B for the base, the lid and the outer rings of the core, and of
5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the
final proteasome structure. Table S2C presents the identity between the built structure and
the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is incomplete in the
C-terminal section. In these cases, the last available residue was used instead to calculate
the distance (a list is provided in Table S2D). The distances were calculated as the
weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example
of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of
surface residues were used as nodes to build the graph. Edges were placed between each pair
of nodes within a distance cutoff of 15 Å for RNApol II and of 30 Å for the proteasome. The
weight of each edge was equal to the distance between the node pair. Surface residues were
identified as follows. First, the structure of the protein complex was represented using the
“show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for
the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl”
graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the
proteasome structure were considered surface residues (see Fig. S2D for a representation of
the method for the proteasome). In cases where multiple copies of the proteins were present
within the complexes, the mean of the minimal possible distances was used for the analyses.
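The surface-path distance can be illustrated with a minimal sketch. The analysis itself relied on NetworkX; to stay self-contained, the sketch below uses a small standard-library Dijkstra implementation instead. The coordinate dictionary, residue labels and function names are hypothetical.

```python
import heapq
import math

def build_surface_graph(coords, cutoff):
    """coords: {residue_id: (x, y, z)} of surface C-alpha atoms, in angstroms.

    Returns an adjacency list with an edge, weighted by Euclidean distance,
    between every pair of residues closer than the cutoff.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf
```

Because edges only connect nearby surface residues, the resulting distance follows the surface of the complex and is at least as long as the straight-line distance between the two C-termini.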
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be
introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, the effect is minor and not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, each fused to
the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony arrays
(57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the
regular 2xL). These include proteins known to interact with the baits, proteins within the
same complexes as the baits and random proteins used as controls, for a total of 26,640
potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs
(z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of
PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA
signal represent cases between baits and preys within the same complexes, suggesting that
there is no decrease in specificity with the elongated linkers. Finally, for the cases where
proteins were not in the same complex or were not previously shown to interact, it is likely
that they represent actual interactions previously undetected in living cells. For example,
many genetic and physical interactions (in vitro and in vivo) have been described between
the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in
living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL
compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased
linker size reveals new interactions and could be an improved tool to study inter-complex
associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes) that differ in
size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for
all proteins within a complex. As a negative control, tests for PPIs between RNApol I, II
and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755
interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228
protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as a single PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3],
were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of
interacting pairs (135/228, or 114/197 unique), no significant change in interaction
strength was observed when using the 4xL compared to the 2xL, reinforcing the conclusion
that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig.
1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65
unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can
substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect
new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL
compared with 2xL-2xL). However, the signal was often improved further with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two
largest components of RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects
rather than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion of new interactions (about 30-40%) was detected for the RNApol
complexes than for the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to an individual
polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an
assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center
panel). In the COG complex, new interactions were seen between Cog1 from the core subunit
and proteins from lobe A or lobe B (Fig. 1D, right panel). Together, these results show that
doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, the PCA signal obtained with longer linkers
allowed better discrimination between the different subunits of large complexes. This is
particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs
are detected when the two proteins are in the same subcomplex (such as base-base, core-core
and lid-lid) regardless of the linker length, though the fraction is systematically higher
with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D
and 1E, left and right panels). Structural biology in living cells could thus gain from PPI
data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for
distance, we measured the shortest path between the C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig. 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the cumulative
distribution of z-scores as a function of pairwise distance, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig. 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that
the distance between proteins is significantly longer for cases where a longer linker
increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here we
show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection
could be used as preys, requiring only the construction of baits with different linker
sizes. PCA is therefore an addition to the other methods available to obtain low-resolution
structural information on subunits of complexes, which include chemical cross-linking of
protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent
biotinylation in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems
Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC
NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on
the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization of
protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and
a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with
red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square:
proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs
(significantly decreased or increased when compared to the 2xL-2xL reference interaction).
Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with
a longer linker combination). (C) Proportions of quantitatively changed interactions and new
PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots
of all detected PPIs for selected complexes. Line thickness is proportional to the
difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged
PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns
inside colored boxes represent proteins that were absent from the experiment. (E) Proportion
of detected PPIs out of the total tested for each combination of subcomplexes within
complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome
catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at
different distances or in different subunits are highlighted on each structure. Distances
between the C-termini of these selected proteins and the associated PPI z-scores for these
newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative
z-scores for the proteasome PPIs according to the protein pairwise distances. (C)
Distribution of three categories of detected PPIs for the RNApol and proteasome complexes
according to the distance between the C-termini, for interactions that are not affected by
longer linkers and those that increase in signal or are newly detected. p-values of Wilcoxon
tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference  Yeast protein  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast
proteins Complex
Identity
()
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 971
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins   Complex   Reference   Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
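A count such as those in Table S2D can be derived from the length of the experimental sequence and the residue numbers that are resolved in the structure. The following is a minimal Python sketch, not the procedure used in this work: the function name `missing_c_terminal` and its inputs are hypothetical, and it assumes that structure residue numbers are 1-based and aligned with the experimental sequence.

```python
from typing import Iterable


def missing_c_terminal(full_length: int, resolved_positions: Iterable[int]) -> int:
    """Return the number of residues after the last resolved position.

    full_length: length of the experimental (full) protein sequence.
    resolved_positions: 1-based residue numbers present in the structure
        (assumed to share the numbering of the experimental sequence).
    """
    positions = list(resolved_positions)
    if not positions:
        # Nothing is resolved, so the whole chain counts as missing.
        return full_length
    # Residues beyond the last resolved one are missing from the C-terminus.
    return max(full_length - max(positions), 0)


if __name__ == "__main__":
    # Hypothetical protein of 120 residues resolved from position 5 to 110.
    print(missing_c_terminal(120, range(5, 111)))
```

Internal gaps and missing N-terminal residues are deliberately ignored here, because only the C-terminus carries the DHFR fragment in this assay.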
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions
with the 4xL-4xL combination. Correlation coefficients for the other combinations are
r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI
z-scores for the proteasome for all combinations of linker lengths, according to the distance
between the interacting proteins. The red line represents the density of distances for all
interactions. The distribution for detected interactions is shifted to the left because
proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results of the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex
experiment. Examples for each category of changes are shown. Cell growth in spot-dilution
assay (right) correlates with colony size in standard PCA (left).
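Panel (B) describes positive PPIs as those whose colony size exceeds a z-score cut-off. A minimal Python sketch of such a call is given below. It is an illustration under stated assumptions, not the analysis used in this work: the function name `call_positive_ppis`, the choice of a separate background distribution as reference, and the example values (including the cut-off of 2.5) are all hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def call_positive_ppis(
    colony_sizes: Dict[str, float],
    background: List[float],
    threshold: float,
) -> Dict[str, bool]:
    """Flag each PPI whose z-score against the background exceeds threshold.

    colony_sizes: colony size per tested pair (hypothetical labels).
    background: colony sizes assumed to represent non-interacting pairs.
    threshold: z-score cut-off above which a PPI is called positive.
    """
    mu = mean(background)
    sd = stdev(background)  # sample standard deviation
    return {
        pair: (size - mu) / sd > threshold
        for pair, size in colony_sizes.items()
    }


if __name__ == "__main__":
    background = [8.0, 9.0, 10.0, 11.0, 12.0]    # made-up values
    sizes = {"bait1-prey1": 20.0, "bait1-prey2": 11.0}
    print(call_positive_ppis(sizes, background, threshold=2.5))
```

Making the cut-off an explicit argument lets the same function be reused for each experiment, since the threshold line differs between panels.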
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
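The distance-weighted shortest path of panels (C) and (D) can be illustrated with Dijkstra's algorithm on a graph whose nodes are surface residues. The Python sketch below is an assumption-laden illustration, not the implementation used in this work: the function name `surface_path_length`, the rule that two residues are connected when they lie within a `cutoff` distance, and the toy coordinates are all hypothetical.

```python
import heapq
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]


def surface_path_length(
    coords: Dict[str, Point], start: str, end: str, cutoff: float
) -> float:
    """Length of the shortest path from start to end over surface residues.

    coords: residue label -> (x, y, z) coordinates of surface residues,
        including the two C-terminal residues used as start and end.
    cutoff: residues closer than this distance are joined by an edge whose
        weight is their Euclidean distance (an assumed graph construction).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    done = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == end:
            return dist
        for other, xyz in coords.items():
            if other in done:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(heap, (dist + step, other))
    return math.inf


if __name__ == "__main__":
    toy = {"Cter_A": (0.0, 0.0, 0.0), "s1": (3.0, 0.0, 0.0), "Cter_B": (3.0, 4.0, 0.0)}
    print(surface_path_length(toy, "Cter_A", "Cter_B", cutoff=4.5))
```

Restricting the path to surface residues forces it around the complex rather than through it, which is a closer proxy for the distance a flexible linker would have to span than the straight-line distance between C-termini.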
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in
space, without these associations necessarily being physical interactions. Such a method
would make it possible to examine and dissect the architecture of protein complexes in
greater depth. In practice, this meant modifying the length of the linkers used in the DHFR
PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing the
linker length changed the set of interactions detected. It was also relevant to test the
method on protein complexes using several combinations of linkers of different lengths.
Finally, the validity of the method could be further confirmed by comparing the results
with distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thereby
improved the identification of associations. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the detected associations suggests that the specificity
of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the
five protein complexes shows that this variation of the DHFR PCA detects new interactions
while preserving the specificity of the method. Among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain almost as many new
interactions if this variation of the PCA were applied to protein complexes that have
already been studied. This percentage could vary with the number of combinations of linkers
of different lengths that are used. For example, it could be lower if only one combination
were used, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be
sufficient to detect most of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases
can nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers is informative
about the organization of protein complexes, particularly when it involves the central
proteins. Finally, the detected associations closely reflect the organization of protein
complexes into subcomplexes. By comparing the distances between proteins in the proteasome
structures with the PCA results, it is possible to confirm that increasing the linker
length does allow the detection of associations between proteins that are further apart in
space.
The modification made to the DHFR PCA is a valuable advance in the study of protein-protein
associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment
increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the linkers
of other lengths. This would provide a validation and a point of comparison, and would help
detect problems that may have occurred during strain construction, such as incorrect
recombination or the appearance of mutations. Interactions would be observed for the
correctly constructed protein, whereas the problematic construct would show none. Adding
this control does, however, make the experiments and the analyses more complex. Despite
this drawback, this variation of the DHFR PCA provides an additional hybrid method that
remains relatively simple. It requires no special infrastructure, yet it can also be
applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo
method that retains the endogenous promoter for protein expression. The fragments do not
tend to associate spontaneously unless they are very close to each other, which reduces
false positives. The DHFR PCA can be performed on solid or in liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be monitored in real time, which gives access to the study of
interaction dynamics (56). These features provide certain advantages over other hybrid
methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths that provides several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also have to be
determined, so as to maximize new information while taking into account the additional
complexity that comes with each added linker. The availability of robotic platforms makes
it more realistic to create collections of DHFR F[1,2] proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the detected associations, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution
would be needed for large protein complexes.
The method developed during this master's project is particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. The size
of these complexes greatly limits their study and makes determining their architecture a
challenge. "Processing bodies" and stress granules are examples. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by extending the linker would give us
a general picture of their architecture. At the level of an organism's proteome, this
method would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
It is also important to thank my parents, as well as my whole family and my friends.
My parents always encouraged me to fulfill myself and to love my work. They not only
provided an ideal setting in which to reach my goals throughout my studies, but also
offered me their moral support and instilled in me the importance of always doing
one's best. The values they passed on to me gave me a strong sense of responsibility,
honesty and involvement. Thanks to my family and friends, I was able to unwind, simply
have fun and pour my heart out from time to time. They were a source of moral support.
Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc
is an incredibly generous person: generous with his time, his attention, his knowledge
and his passions. He was an invaluable support throughout this journey, at every
moment. His encouragement, his shoulder, his tissues and his understanding soothed my
fears and my sorrows. He was also there to celebrate the successes. I have no words to
describe how much this person has given me personally, humanly and professionally.
Marc has made me a better person, and I will always be grateful to him. Thank you, my
love, thank you for everything.
Foreword
This thesis comprises a single chapter written in the form of a scientific article that
will be submitted for publication. The article presents an adaptation of the PCA method
that detects associations between proteins that are distant in space, and its
application to the study of protein complexes. I contributed to the planning of the
experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and
Alexandre K. Dubé (research professionals). Several people, myself included, took part
in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe
(undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral
fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and
Patrick Lagüe (collaborator). The analysis of the results and the writing of the article
were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to the writing of a literature review published
in Briefings in Functional Genomics in March 2016 under the title "Multi-scale
perturbations of protein interactomes reveal their mechanisms of regulation, robustness
and insights into genotype-phenotype maps". Several people took part in the writing:
Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel
Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger
(master's student) and Christian R. Landry. That article is not presented in this
thesis.
General introduction
1.1 The fundamental role of protein-protein interactions
Proteins, through their great diversity of roles, are considered the machinery of life.
Their associations, whether temporary or permanent, are at the heart of signaling and
regulatory pathways as well as of protein complexes. Proteins can interact with one
another through intermolecular forces such as hydrogen bonds, hydrophobic interactions,
van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are
essential to the proper functioning of the cell, since they are involved in all cellular
processes and in the maintenance of cellular functions.
Transient interactions are often found in signaling and regulatory processes. They
require excellent spatiotemporal coordination, which explains why poor coordination can
lead to diseases such as cancer (1). One example of a transient association is that of
the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA)
(2). The activity of this enzyme is regulated by the association and dissociation of the
catalytic and regulatory subunits. In yeast and mammals, the transition from one form to
the other controls several processes, including energy metabolism, cell growth, aging
and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to
diseases such as Cushing's syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs within a
complex are stable, the complex itself may form only in a particular context. A protein
complex can be defined as an association between two or more proteins (9). The
association between these proteins allows additional biological activities to emerge
that would be impossible for the individual proteins. An example that illustrates this
concept very well is the proteasome, a protein complex involved in protein homeostasis
through the degradation of obsolete proteins tagged with a ubiquitin chain. Its
structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex
flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some present in
more than one copy (10-13). Given its importance in protein recycling, the proteasome is
an attractive target for fighting cancer and neurodegenerative diseases, for example
(14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network
has been mapped on a large scale for several organisms, notably humans (17),
Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans
(22), several bacteria (23-26) and several viruses (27-29). These maps provide a static
picture of the network that does not fully account for the cell's ability to adapt to
different conditions (e.g. environment, cell cycle). To overcome this limitation,
additional maps were subsequently produced that take the dynamics of interaction
networks into account by perturbing cell growth conditions. They provide information on,
among other things, the adaptation or plasticity of an organism in the presence of a
stress or a new environment. Despite this new perspective, it remains difficult to
distinguish a stable interaction from a transient one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as shown by
the study of the nuclear pore of yeast and of trypanosomes (30). These two organisms,
which diverged more than 1.5 billion years ago, show similarities and differences in the
structure of their nuclear pore. This essential protein complex forms a channel in the
membrane of the cell nucleus and controls the transport of molecules between the nucleus
and the cytoplasm. Obado and colleagues thus identified the ancestral part of the
nuclear pore and the part that subsequently diverged. The structural differences explain
the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing
PPIs makes it possible to elucidate the robustness of a protein complex to mutations,
that is, the ability of the complex to function despite the perturbation. Diss and
colleagues systematically deleted the genes encoding the proteins found in the nuclear
pore and the retromer (31). The retromer is a non-essential protein complex whose
function is the recycling of membrane receptors. By analyzing the interactions present
in these complexes after each perturbation, the authors observed that the nuclear pore
remained functional despite the loss of certain proteins, whereas the retromer
dissociated completely after the loss of a single protein. They were thus able to
identify the proteins essential for the assembly of these complexes and to demonstrate
the importance of paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs
(32-34). Moreover, identifying the structural differences in a protein complex between
two organisms can provide attractive targets for selectively inhibiting the complex of
one organism. Very recently, a research group developed an inhibitor that targets the
proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma
brucei, which should eventually make it possible to treat infections caused by these
parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown
by Sahni and colleagues. This team examined nearly 3,000 mutations found across a
spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction
networks was responsible for the diseases under study, with the networks affected either
partially or completely. Furthermore, different mutations in the same gene lead to
different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been
developed to study them. These methods are complementary, since each has advantages and
limitations that allow it to target only a particular subset of the interaction network
(37). Nonetheless, the methods can be divided into two main categories: methods that
determine the composition of protein complexes and methods that determine physical
interactions between two proteins.
The first category includes methods that purify a protein complex, by affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second
category comprises a wide variety of methods, including the yeast two-hybrid (Y2H), the
membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA).
The methods in the second category share a very similar principle, based on the
reconstitution of a functional reporter that emits a signal when the two proteins
physically interact. The second category also includes three hybrid methods: energy
transfer between fluorescent molecules (FRET), cross-linking followed by MS, and
proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to
methods that detect associations between proteins that are close together in space
without necessarily interacting physically. These methods therefore have characteristics
of both categories. For the purposes of this project, they are considered part of the
second category because they provide information on the spatial relationships between
proteins.
The two categories of methods are complementary because they define, on the one hand,
the components of a protein complex and, on the other, the relationships these
components maintain with one another.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS
aims to isolate a protein complex and identify its members. Several techniques are used
to purify protein complexes, including affinity chromatography. Affinity chromatography
separates a protein of interest and its interactors from a protein extract by means of
an epitope specific to that protein. This epitope is recognized by an antibody bound to
the purification column. Several rounds of purification can be performed to reduce the
non-specific interactions that cause background noise. The isolated proteins are then
digested into peptides. The mass spectrometer ionizes these peptides and separates them
according to their mass-to-charge ratio, producing a mass spectrum. Comparing the
resulting profiles with those in a database identifies the proteins found in the complex
(38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a
peptide is selected and fragmented, and a new spectrum is acquired from the resulting
fragments. This additional spectrum provides more information about the peptide (41,
42). Other purification techniques exist, such as size-exclusion chromatography, in
which separation is based on the size of the protein complexes. The main advantage of
this purification is that it can isolate all the protein complexes of an organism for
study (43).
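The separation of peptides by mass-to-charge ratio can be illustrated with a minimal Python sketch. It assumes positive-mode ionization in which each charge comes from one added proton; the peptide mass of 1500.0 Da and the function name `mass_to_charge` are hypothetical and serve only as an illustration.

```python
PROTON_MASS_DA = 1.007276  # mass of one proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of a peptide ion carrying `charge` added protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Hypothetical peptide mass, chosen only for illustration.
peptide_mass_da = 1500.0
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mass_to_charge(peptide_mass_da, z):.4f}")
```

The same peptide therefore appears at a different position in the spectrum for each charge state, which is one reason database matching works on m/z profiles rather than on raw masses.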
1.3.2 Methods determining the protein interaction network
1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid and the protein-fragment
complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter
fragments attached to the two proteins of interest through a linker. When the two
proteins of interest physically interact, the two reporter fragments assemble,
reconstituting a functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor which, once reconstituted, allows the
yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription
factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering
method that enabled the development of several others. However, the technique has some
limitations. First, in classical Y2H, the proteins studied must be soluble. Variations
of the method have nevertheless been developed to allow the study of membrane proteins
(45-47); this method is the subject of the next paragraph. Second, since the reporter is
a transcription factor, the interactions tested must take place in the nucleus, which
may alter the endogenous localization of the proteins. The technique also has low
sensitivity, suffers from background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is therefore
impossible to establish links between the abundance of a protein and the strength or
abundance of an interaction between proteins (48-50). Despite these constraints, it is
still widely used because it allows the PPIs of another species, such as humans, to be
studied in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a
transcription factor is attached. In the presence of a physical interaction between the
proteins of interest, the transcription factor attached to the reconstituted ubiquitin
is released, thereby activating transcription of a reporter gene. Split-ubiquitin-based
methods have enabled major advances in the study of membrane proteins, insoluble
proteins and proteins located outside the nucleus. However, MYTH shares certain
drawbacks with Y2H, such as substantial background noise and the impossibility of
quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above but, rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The
choice of reporter and of the cleavage site were decisive elements in the design of the
method. Moreover, because the reporter fragments come from a single protein rather than
from two subunits of the same protein, they do not tend to interact spontaneously unless
they are very close to each other, which reduces background noise (54). In yeast, PCA
uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that
confers on the cell resistance to methotrexate (MTX). This enzyme is essential for cell
growth and is involved in particular in the synthesis of certain DNA bases (purines and
thymine). In yeast, the observed signal is cell density, that is, the number of cells
that managed to grow on the selection medium. The technique has the advantage of being
quantitative while preserving the native promoter of the proteins studied (48, 55, 56).
In addition, results obtained by PCA suggest that the cellular localization of proteins
is preserved: there is gene ontology enrichment for several proteins known to share the
same cellular localization (55). However, a change in localization cannot be ruled out,
since the reporter fragments are added at the C-terminus, which could interfere with the
proteins' localization signal sequence (57).
A major drawback of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
However, adding a linker often reduces these risks by distancing the reporter fragment
from the protein to which it is attached, which reduces interference between the two.
Its composition or length may need to be optimized. There are three categories of
linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers
are generally used when some mobility between the protein of interest and the reporter
fragment is desirable. Rigid linkers provide better separation between the protein of
interest and the reporter fragment and ensure that the functions of each element are
maintained. They are especially useful when a flexible linker is insufficient to
separate the two elements properly or interferes with the activity of the protein. In
vivo cleavable linkers allow the reporter fragment to be released under certain
conditions. They are particularly useful for allowing each element to carry out its own
biological activity. It is therefore essential to choose the linker and its parameters
carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by
MS, and BioID are hybrid methods that measure protein-protein associations at lower
resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The
two fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close to each other. The interaction is
detected by microscopy or cytometry through the emission of the acceptor fluorescent
protein. This method is particularly useful for following an interaction over time.
However, substantial background noise and the partial overlap of the fluorescence of the
two proteins can hamper interpretation of the results (60-63).
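The distance dependence that makes FRET a proximity reporter is commonly described by the Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 the distance at which transfer is 50% efficient. The sketch below assumes this relation; the Förster radius of 50 Å and the distances tested are illustrative values, not measurements from this work.

```python
def fret_efficiency(distance_angstrom: float, forster_radius_angstrom: float) -> float:
    """Energy-transfer efficiency of a donor-acceptor pair (Förster relation)."""
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


# Hypothetical Förster radius for a fluorescent-protein pair (assumption).
R0_ANGSTROM = 50.0
for r in (25.0, 50.0, 100.0):
    print(f"r = {r:5.1f} Å -> E = {fret_efficiency(r, R0_ANGSTROM):.3f}")
```

Because of the sixth-power term, efficiency falls from nearly complete to nearly zero over a narrow range of distances around R0, which is why a FRET signal is read as evidence of proximity rather than of direct physical contact.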
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that, before purification, the proteins are attached to one another
by covalent bonds. These bonds resist enzymatic digestion, thus providing structural
information on how the proteins are associated within the protein complex. Nevertheless,
cross-linking complicates data analysis and may lead to a mistaken picture of the
architecture of the protein complex. This method is difficult to apply to the global
study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity
and is fused to the protein of interest. Interactors bearing a biotin group on their
accessible lysines are selectively isolated and identified by MS. BioID can detect weak
and transient interactions as well as interactions between neighboring proteins.
However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent
protein widely used in molecular biology. This large size can impair the activity of the
protein of interest or the formation of interactions. In addition, the method is not
quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly interesting because they give a more
global view of the PPI network. They provide information on the proximity of proteins,
giving access to a new molecular scale of resolution that is otherwise difficult to
reach. In addition to being complex, the existing techniques require specialized
infrastructure (equipment and databases) and are difficult to apply on a large scale.
The development of simpler, higher-throughput hybrid methods would make it possible to
better define the architecture of protein complexes and their subcomplexes at low
molecular resolution. Such methods would complement the two categories of methods. These
new hybrid methods would compensate for the shortcomings of high-molecular-resolution
methods, such as crystallography or nuclear magnetic resonance, which determine the
precise structure of proteins or protein complexes. Indeed, those high-resolution
methods are difficult to apply to many protein complexes and require an approach
specific to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and of the linker that joins the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment made of two repeats of the
following motif: four glycines and one serine (GGGGS). It provides good flexibility and
allows the reporter fragments to associate properly in the cellular environment. Indeed,
glycine and serine are two small amino acids, the former nonpolar and the latter polar and
uncharged. The linker joins the reporter fragment to the C-terminus of the proteins under
study.
Linker length also constrains the ability to detect an interaction, as was notably observed
by the research team that developed large-scale PCA (55). While studying RNA polymerase
(RNApol) II and several other protein complexes, the authors noticed that an interaction
was 3.5 times more likely to be detected when the C-termini of the proteins of interest
were less than 82 Å apart (55). This distance corresponds to the length of the two linkers
placed end to end. In addition, an earlier study had shown that increasing linker length
made it possible to determine the conformation of a dimeric receptor (69). Longer linkers
can thus reveal new interactions and, in doing so, provide new structural information.
1.6 Research objectives
The results above suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length of the DHFR PCA would make it
possible to detect interactions that are increasingly distant in space, which would
modulate the scale of molecular resolution. This adaptation would then yield a new hybrid
method that could help define protein-protein associations between protein complexes and
subcomplexes. The first objective was to assess the general impact of different linker
lengths on the ability to detect protein-protein associations. To reach this objective,
15 proteins found in seven protein complexes were tested for association with the proteins
found in these complexes and with their known interactors. The second objective was to
assess the impact of increasing linker length on our understanding of the architecture of
protein complexes and their subcomplexes. Five protein complexes that differ in size and
flexibility were studied: the proteasome, RNApol I, II and III, and the conserved
oligomeric Golgi (COG) complex. The study was carried out with different combinations of
linker lengths. The last objective was to test whether increasing linker length made it
possible to detect associations between proteins that are further apart in space. To do
so, distances were calculated between the proteins contained in the proteasome structures
and compared with the experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive for several reasons, notably the availability of many powerful
genetic tools, its rapid cell division and the abundance of data on protein complex
structures and PPIs. Moreover, this organism has played a key role in advancing knowledge
in various fields such as protein function determination, regulatory networks, gene
expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system functions requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangement. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR
fragments improves the detection of protein-protein interactions and reveals interactions
that are more distant in space. Extended linkers thus provide an improved tool to detect
and measure protein-protein interactions and protein proximity in vivo. We used this tool
to further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for
dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we
examine the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are further apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to
mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the
transcriptional, translational and posttranslational levels contributes to the diversity
of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex memberships or detect physical
association (81). The first category includes methods based on protein purification
followed by mass spectrometry. In this case, protein assignment to a specific complex
depends on stable associations among proteins that survive cell lysis and fractionation or
affinity purification (82, 83). The majority of PPIs that populate interactome databases
derive from such methods because a single purification leads to the inference of many
interactions among the co-purified proteins. Unfortunately, very little is known about the
structural and context dependencies of PPIs inferred from co-complex membership because
detecting an association does not provide information on the spatial organization of the
complex (84-86). The second category of methods reports binary or pairwise interactions
between proteins and reveals direct or nearly direct interactions. Such methods include the
commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs)
(87) and technologies based on similar principles (52). These two categories of methods are
potentially complementary because, on the one hand, they tell us which proteins assemble
into complexes in the cell and, on the other hand, how proteins may be physically located
relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future developments in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing
methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94).
Proteins are usually connected to the reporter fragments with a linker of ten amino acids.
In principle, the length of the linker limits the maximum distance between the proteins for
an interaction to be detectable. The first large-scale study performed using DHFR PCA in
yeast showed that the distance constraint determined by linker length could affect the
ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other
protein complexes for which the distance between the C-termini of proteins could be
measured, protein interactions were 3.5 times more likely to be detected if the C-termini
were within 82 Å of each other. In addition, an earlier study in mammalian cells showed
that increasing the linker length of the PCA reporter makes it possible to detect
configuration changes in a dimeric membrane receptor (69). Together, these results suggest
that linkers of variable sizes could improve the detection of PPIs and even be used as a
ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test
the effect of linker size on the ability to detect PPIs by PCA in living cells using the
yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for solid
medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B
(HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were
grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium
sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without adenine, methionine and lysine, and
200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and 2% Agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to a linker of varying size. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two
for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting
in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL,
3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the
plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2]
fragments were then amplified by PCR, digested with DpnI and purified. The plasmid
pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The
fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio
of 5:1. Cloning reactions were transformed in E. coli, and clones were selected on
2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI
and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from
pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR
F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region
was gel-extracted. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR
F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions,
respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (respectively for resistance to clonNAT and HygB), were
amplified by PCR from their respective plasmid with oligonucleotides specific to the gene
to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following standard
procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or
YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing for all strains confirmed
proper DHFR fragment fusions.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused
with the 2xL and 4xL. These proteins were selected because we could easily assess their
abundance using antibodies directed against them. 20 OD600 of exponentially growing cells
were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma)
were added (0.1 g), and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel
and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes
were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p
(kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse
anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at
room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were then
probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG
(1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS +
0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed, and the signal
on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications
were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins that are part of seven
complexes, each fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3]
(495 strains) were selected because they belonged to the same complexes as the baits or
because they interacted with one of them based on data reported in BioGRID in October 2014
(96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the
nucleus was also included in the set of preys as controls. Each prey was present in four
replicates, two on each prey plate, so each interaction was measured four times. Preys
were randomly positioned to avoid location biases.
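As a rough cross-check of the screen dimensions described above, the following minimal sketch multiplies out the counts given in the text. The function and variable names are hypothetical and are not part of the original analysis.

```python
# Counts taken from the text; all names below are illustrative only.
N_BAIT_PROTEINS = 15          # bait proteins from seven complexes
LINKERS = ("2xL", "3xL", "4xL")  # linker lengths fused to DHFR F[1,2]
N_SELECTED_PREYS = 495        # preys chosen from complexes and BioGRID
N_RANDOM_PREYS = 97           # random cytoplasmic or nuclear controls
N_REPLICATES = 4              # each prey is arrayed four times


def screen_size(n_bait_proteins, linkers, n_selected, n_random, n_replicates):
    """Return bait, prey, pair and colony counts for a bait x prey screen."""
    n_baits = n_bait_proteins * len(linkers)
    n_preys = n_selected + n_random
    n_pairs = n_baits * n_preys
    return {
        "baits": n_baits,
        "preys": n_preys,
        "pairs": n_pairs,
        "colonies": n_pairs * n_replicates,
    }


GLOBAL_SCREEN = screen_size(
    N_BAIT_PROTEINS, LINKERS, N_SELECTED_PREYS, N_RANDOM_PREYS, N_REPLICATES
)
```

With these inputs, the sketch gives 45 bait strains, 592 preys and 26,640 bait-prey pairs, each measured four times.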
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published in (84) to choose 95 central and associated
protein members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I (n=14),
II (n=12), III (n=17) and associated proteins (n=9; 7 tested); proteasome (n=47; 44
tested); and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We
successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in
MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total,
286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used,
representing 89.5% of the proteins (85 out of the 95 proteins initially selected were
tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were
used as baits. Two different prey plates of MATa cells were generated, including all
strains mentioned above. Baits and preys were positioned so that, in a block of four
strains, all combinations of linker sizes could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present
in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome
complex. The blocks were randomly positioned on the colony arrays. Each 1536-array was
finally designed to contain a double border of a strain showing a weak interaction
(Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
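The block design described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the array geometry is an assumption (a 1536-format plate laid out as 32 rows by 48 columns, with "double border" read as two colonies thick on every side).

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait, prey, linkers=LINKERS):
    """All bait-linker x prey-linker combinations for one protein pair."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(linkers, repeat=2)]


def interior_positions(rows=32, cols=48, border=2):
    """Positions left on the array once a border of the given thickness is reserved.

    The 32 x 48 layout and the border thickness are assumptions, not values
    stated in the text.
    """
    return (rows - 2 * border) * (cols - 2 * border)


# Example with two proteasome subunits named in the text.
BLOCK = linker_block("Rpn5", "Scl1")
# Number of four-strain blocks that would fit under the stated assumptions.
BLOCKS_PER_ARRAY = interior_positions() // len(BLOCK)
```

The four entries of `BLOCK` follow the order 2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL, so the interaction scores of one block can be compared side by side.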
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated
at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin)
replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics)
and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by
arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies
were further condensed into 384-format arrays and finally into 1536-format arrays using a
96-pin and a 384-pin replicating tool, respectively. Two different prey plates in
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with
the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at
30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an
incubation time of two days at 30°C per round. Finally, diploid strains were replicated on
MTX medium and incubated at 30°C for four days, after which a second round of MTX
selection was performed. Plates were incubated at 30°C for another four days. Images were
taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid
selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution
assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment
fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial
dilutions were performed, and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA
media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS
Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using
the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted, and the
pixel intensity matrix was smoothed using a two-way median polish and averaged with the
raw image. We then converted the images to binary files, and a manual threshold was
applied across plates. We selected colonies for measurement with a circular selection
using particle detection with the built-in function “Analyze Particles” in ImageJ64. We
excluded particles touching the edge of the selection and those that had an area below 20
pixels and a circularity below 0.5, and used the particle closest to the center. We
considered the particle to be a colony if its center of mass was within the mid-distance
between two colonies. All plate images were also examined. The average of the background
pixels was subtracted from the colony intensity.
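The particle selection rule can be illustrated with a short sketch. It is not the custom ImageJ script used in the study: the `Particle` record and function names are hypothetical, circularity is computed with the usual definition 4πA/P², and the sketch assumes that a particle failing either the area or the circularity threshold is discarded.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    area: float       # area in pixels
    perimeter: float  # perimeter in pixels
    cx: float         # x coordinate of the center of mass
    cy: float         # y coordinate of the center of mass


def circularity(p):
    """4*pi*area / perimeter^2; close to 1 for a disc, lower for elongated shapes."""
    return 4.0 * math.pi * p.area / p.perimeter ** 2


def pick_colony(particles, center, min_area=20.0, min_circularity=0.5):
    """Keep particles passing both thresholds and return the one nearest the center.

    Returns None when no particle qualifies.
    """
    kept = [
        p for p in particles
        if p.area >= min_area and circularity(p) >= min_circularity
    ]
    if not kept:
        return None
    return min(kept, key=lambda p: math.hypot(p.cx - center[0], p.cy - center[1]))
```

For example, a round particle of area 314 and perimeter 62.8 is kept, whereas a speck of 10 pixels or a streak of area 100 and perimeter 80 is rejected.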
Colony intensity values from day 4 of growth on the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a
size smaller than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained, and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R
package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were
considered significantly detected. These Zs were used to compare the same interaction
across different linker size combinations. We considered changes significant when Zs
differed by more than 2.
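The scoring step can be sketched in Python. The study used the R package mixtools; the code below is a stand-in that fits the two-component normal mixture with a plain expectation-maximization loop, takes the lower component as the background, and applies Zs = (Is - b)/sdb with the 2.5 threshold. All function names and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mean_sd(values):
    """Mean and population standard deviation, floored to avoid a zero width."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, max(math.sqrt(var), 1e-6)


def fit_two_normals(scores, n_iter=50):
    """EM fit of two normal components; returns (weight, mean, sd) sorted by mean."""
    # Crude starting point: split the scores at the middle of their range.
    cut = (min(scores) + max(scores)) / 2.0
    low = [s for s in scores if s <= cut]
    high = [s for s in scores if s > cut]
    if not high:
        raise ValueError("scores must not all be identical")
    comps = [(len(part) / len(scores),) + mean_sd(part) for part in (low, high)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for s in scores:
            dens = [w * normal_pdf(s, m, sd) for w, m, sd in comps]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and standard deviation of each component.
        updated = []
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            m = sum(r[k] * s for r, s in zip(resp, scores)) / nk
            var = sum(r[k] * (s - m) ** 2 for r, s in zip(resp, scores)) / nk
            updated.append((nk / len(scores), m, max(math.sqrt(var), 1e-6)))
        comps = updated
    return sorted(comps, key=lambda c: c[1])


def z_scores(scores, threshold=2.5):
    """Return b, sdb and a list of (Zs, detected) using the lower component."""
    (_, b, sd_b), _signal = fit_two_normals(scores)
    return b, sd_b, [((s - b) / sd_b, (s - b) / sd_b > threshold) for s in scores]


def simulated_scores(seed=0):
    """Made-up scores: 900 background values followed by 100 signal values."""
    rng = random.Random(seed)
    background = [rng.gauss(5.0, 0.5) for _ in range(900)]
    signal = [rng.gauss(10.0, 0.5) for _ in range(100)]
    return background + signal
```

Calling `z_scores(simulated_scores())` returns a background mean and standard deviation close to the simulated values, and every simulated signal score passes the threshold. Real colony data are unlikely to separate this cleanly.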
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined
as values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3
represent the first and third quartiles). Colonies corresponding to the control
interaction and positioned on the array edges were removed from downstream analyses, as
well as strains for which sequencing results revealed mutations in the DHFR fusion
proteins. After these final filtering steps, interactions with at least four replicates
for every linker combination were retained, and the median colony size was used as the Is.
Significant interactions were identified as described above (Fig. S1B). For the RNApol and
the proteasome, the estimated mean (b) and standard deviation (sdb) of the background
distribution were calculated for each linker combination and each complex separately. For
the COG complex, because the number of pairwise interactions is limited to 64, all the
results were combined to calculate these parameters. An interaction was considered
detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected
interactions with at least one linker combination, some pairs were filtered out, mainly
because they did not pass all of the thresholds or because the fusion strains (Taf14 and
Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving
us with a total of 228 (197 unique) pairs of interacting proteins.
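A minimal sketch of the outlier filter and of the median interaction score follows. The function names are hypothetical, and the quartile convention is an assumption: the text does not state which one was used, so the sketch uses the "inclusive" method of Python's statistics module.

```python
import statistics


def drop_extreme_outliers(sizes, k=3.0):
    """Remove values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    if len(sizes) < 2:
        return list(sizes)
    q1, _median, q3 = statistics.quantiles(sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in sizes if low <= s <= high]


def interaction_score(sizes, min_replicates=4):
    """Median colony size after outlier removal, or None with too few replicates."""
    kept = drop_extreme_outliers(sizes)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)
```

For replicate sizes 10 to 16 plus one colony of size 100, the large value falls above the upper fence and is dropped, and the interaction score is the median of the remaining seven values.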
At this step, pairs of interacting proteins presenting a new interaction (i.e. the
interaction was not detected with the reference linker size (2xL-2xL) but was detected
with a longer linker combination) were separated from the others and classified as new
interactions (Table S1C). For the remaining pairs, because baits and preys were positioned
so that, in a block of four adjacent strains, all combinations of linker lengths could be
tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the
different linker size combinations could be compared directly. The difference with the
reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL
and 4xL-4xL). A paired t-test was used to identify significant differences in colony size
(with FDR-corrected p-values). These pairs of interacting proteins were separated into two
additional categories: unchanged interactions, in cases where the interaction was detected
with the reference linker size (2xL-2xL) and also with the longer linker combinations but
without any significant change (t-test FDR p-value above 0.05); and quantitative changes,
in cases where the interaction was detected with the reference linker size (2xL-2xL) and
presented significant changes for at least one longer linker combination (difference
greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
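The three categories can be expressed as a small decision rule. In this sketch the p-values are assumed to come from paired t-tests computed elsewhere, and the FDR correction is assumed to be the Benjamini-Hochberg procedure, which the text does not name. The function names are hypothetical.

```python
def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_long, diffs, fdr_pvals,
                  alpha=0.05, min_diff=1.0):
    """Classify one protein pair.

    detected_ref  -- True if detected with the 2xL-2xL reference
    detected_long -- detection flags for the longer linker combinations
    diffs         -- Is differences versus 2xL-2xL, one per longer combination
    fdr_pvals     -- FDR-corrected p-values, one per longer combination
    """
    if not detected_ref:
        return "new" if any(detected_long) else "not detected"
    changed = any(
        abs(d) > min_diff and p < alpha for d, p in zip(diffs, fdr_pvals)
    )
    return "quantitative change" if changed else "unchanged"
```

A pair detected only with 4xL-4xL is labeled "new", whereas a pair detected with 2xL-2xL whose largest difference is 0.4 stays "unchanged" whatever its p-value.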
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the
RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base
and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome
complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of
the experimental protein sequences of the individual sections of the proteasome complex
with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4.
Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB
ID 5CZ4 is composed of a full core. A complete proteasome structure was built by
superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP,
using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B
structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is
well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the
base, the lid and the outer rings of the core. The inner rings of the core were from
structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome
structure. Table S2C presents the identity between the built structure and the
experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to
calculate the distance (a list is provided in Table S2D). The distances were calculated
from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX
(an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα
atoms of surface residues were used as nodes to build the graph. The edges of the graph
were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II
and of 30 Å for the proteasome. The weight of each edge was equal to the distance between
the node pair. Surface residues were identified as follows. First, the structure of the
protein complex was represented using the “show dots” and “set dot_solvent” commands in
PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the
proteasome. These dots were exported in the “wrl” graphic file format. From this file, the
coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II
structure and within 20 Å of any dot of the proteasome structure were considered as
surface residues (see Fig. S2D for a representation of the method for the proteasome). In
cases where multiple copies of the proteins were present within the complexes, the mean of
the minimal distances possible was used for the analyses.
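The surface-path distance can be illustrated with a self-contained sketch. The study used NetworkX; to avoid any dependency, the code below builds the same kind of graph (nodes are surface Cα coordinates, edges join nodes closer than a cutoff, and edge weights are Euclidean distances) and runs a plain Dijkstra search with a heap. The function names and the toy coordinates are hypothetical.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Adjacency list joining every pair of points no further apart than cutoff."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def surface_distance(graph, source, target):
    """Length of the weighted shortest path (Dijkstra); inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Toy example: three points forming a corner and one isolated point (in Å).
TOY_COORDS = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 0.0), (100.0, 0.0, 0.0)]
TOY_GRAPH = build_surface_graph(TOY_COORDS, cutoff=12.0)
```

In the toy example, the path from point 0 to point 2 has to go through point 1 and measures 20 Å, compared with about 14.1 Å in a straight line. This is the behavior expected of a distance that follows the surface of a complex rather than cutting through it.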
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be
introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment in each case. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). The preys include proteins known to interact with the baits, proteins
within the same complexes as the baits, and random proteins used as controls, for a total of
26640 potential interactions tested in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal are
between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some of these interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
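The screen size and the thresholding step described above can be illustrated with a short sketch. The z-score definition used here (deviation of a colony size from the mean of a background distribution, in units of its standard deviation) is an assumption about how "deviation from the background noise" is computed, the default cutoff of 2.5 is this sketch's reading of the threshold, and all names and numbers in the example are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # assumed reading of the z-score cutoff used to call a PPI


def z_scores(colony_sizes, background):
    """Convert colony sizes to z-scores relative to a background sample.

    colony_sizes: dict mapping (bait, prey) to a colony size.
    background: list of colony sizes assumed to represent noise.
    """
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


def call_ppis(scores, threshold=Z_THRESHOLD):
    """Return the set of pairs whose z-score exceeds the threshold."""
    return {pair for pair, z in scores.items() if z > threshold}


# Size of the global screen: 15 proteins x 3 linker lengths = 45 baits,
# each queried against 592 preys.
n_baits = 15 * 3
n_tested = n_baits * 592
```

With a hypothetical background of `[10, 12, 11, 9, 8]`, a colony of size 20 scores about 6.3 and is called positive, whereas a colony of size 11 scores about 0.6 and is not.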
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as only one
PPI) after filtering.
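The distinction between ordered protein pairs and unique PPIs can be made concrete with a small sketch. It assumes detected interactions are available as (bait, prey) tuples, where the bait carries DHFR F[1,2] and the prey carries DHFR F[3]; the function names are illustrative only.

```python
def unique_pairs(detected):
    """Collapse reciprocal detections (X, Y) and (Y, X) into one unordered pair."""
    return {frozenset(pair) for pair in detected}


def reciprocally_detected(detected):
    """Return the unordered pairs that were detected in both orientations."""
    detected = set(detected)
    return {frozenset((bait, prey))
            for bait, prey in detected
            if bait != prey and (prey, bait) in detected}
```

For example, the detections `("X", "Y")`, `("Y", "X")` and `("X", "Z")` correspond to three ordered pairs but only two unique PPIs, one of which is confirmed in both orientations.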
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e., X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate out to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
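The rank correlation between pairwise distance and z-score reported for the proteasome is a Spearman correlation, which can be computed as a Pearson correlation on ranks. The sketch below is a self-contained illustration with tie-aware average ranks; it is not the analysis code used for the figures, and the example values in the usage note are invented.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, invented distances `[10, 20, 30, 40, 50]` paired with z-scores `[9.0, 7.5, 8.0, 3.0, 1.0]` give a coefficient of -0.9, the kind of negative relationship described above (closer proteins, higher z-scores).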
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information on the subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome.
Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (not detected with the 2xL-2xL reference combination but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different pairwise protein distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini: interactions that are not affected by longer linkers,
those that increase in signal, and those that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Sequence identity between each RNApol structure and the experimental sequences.
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Sequence identity between the proteasome structure and the experimental sequences.
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues at the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures.
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
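The distance-weighted shortest path illustrated in panel (C) can be sketched with Dijkstra's algorithm on a graph whose nodes are surface residues and whose edge weights are Euclidean distances. This is only an illustration under stated assumptions: the rule used here to connect nodes (an edge between any two nodes closer than a hypothetical `max_edge` cutoff), the node names and the coordinates are not taken from the source.

```python
import heapq
import math


def build_graph(coords, max_edge):
    """Connect every pair of nodes separated by at most `max_edge`.

    coords: dict mapping a node name to its (x, y, z) position.
    Returns an adjacency dict: node -> list of (neighbor, distance).
    """
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= max_edge:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf
```

Restricting the path to surface nodes forces it to go around the complex instead of cutting through it, which is why the resulting distance can exceed the straight-line distance between the two C-termini.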
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close together in space, without these associations necessarily being physical
interactions. Such a method would make it possible to probe and dissect the architecture
of protein complexes in greater depth. Concretely, the approach was to modify the length of the
linkers used in DHFR PCA in S. cerevisiae. To validate the method, we first had to
verify whether increasing the linker length changed the interactions
detected. It was also relevant to test the application of the method to the study of
protein complexes using several combinations of linkers of different
lengths. Finally, the validation of the method could be completed by
comparing the results obtained with distances measured from the available protein
structures of the proteasome.
Les reacutesultats de la premiegravere validation deacutemontrent qursquoen jouant sur un seul paramegravetre soit
en doublant la longueur drsquoun connecteur le ratio signal sur bruit a significativement
augmenteacute permettant une meilleure identification des associations Sept nouvelles
associations ont eacuteteacute observeacutees agrave lrsquointeacuterieur de complexes proteacuteiques et entre diffeacuterents
complexes notamment entre le proteacuteasome et le cytosquelette drsquoactine La nature des
associations deacutetecteacutees suggegravere que la speacutecificiteacute de la DHFR PCA est conserveacutee malgreacute la
modification de la longueur du connecteur Lrsquoeacutetude approfondie des cinq complexes
proteacuteiques montre que la variation de la DHFR PCA permet de deacutetecter de nouvelles
interactions en conservant la speacutecificiteacute de la meacutethode En effet parmi lrsquoensemble des
interactions uniques deacutetecteacutees plus de 30 eacutetaient nouvelles Donc on pourrait srsquoattendre agrave
obtenir pratiquement autant de nouvelles interactions si cette variation de la PCA eacutetait
appliqueacutee agrave des complexes proteacuteiques deacutejagrave eacutetudieacutes Ce pourcentage pourrait varier selon le
nombre de combinaisons de connecteurs de diffeacuterentes longueurs utiliseacute Par exemple ce
nombre pourrait ecirctre reacuteduit en nrsquoutilisant qursquoune seule combinaison puisque certaines
associations proteacuteine-proteacuteine eacutetaient uniquement deacutetectables avec une combinaison preacutecise
de connecteurs Lrsquoutilisation drsquoun connecteur allongeacute pour le fragment DHFR F[12] semble
ecirctre suffisante pour deacutetecter la majoriteacute des nouvelles PPI et celles dont le signal augmente
44
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nonetheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers provides
information on the organization of protein complexes, particularly when it involves central
proteins. Finally, the associations detected accurately reflect the organization of protein
complexes into subcomplexes. By comparing the distances between proteins in the proteasome
structures with the PCA results obtained, it is possible to confirm that increasing the
linker length does allow the detection of associations between proteins that are farther
apart in space.
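The comparison between structural distances and PCA results can be sketched in a few lines
of Python. This is only an illustrative sketch: the protein names, distances and detection
sets are hypothetical placeholders, not data from this work, and the function name
`median_distance_by_linker` is an assumption introduced here. The idea is simply to check
whether pairs detected only with the extended linker tend to be farther apart in the
structure than pairs already detected with the standard linker.

```python
from statistics import median


def median_distance_by_linker(distances, detected_standard, detected_extended):
    """Return (median distance of pairs seen with the standard linker,
    median distance of pairs seen only with the extended linker)."""
    only_extended = detected_extended - detected_standard
    standard_d = [distances[pair] for pair in detected_standard]
    extended_d = [distances[pair] for pair in only_extended]
    return median(standard_d), median(extended_d)


# Hypothetical inter-protein distances (in angstroms) taken from a structure.
distances = {
    frozenset({"P1", "P2"}): 35.0,
    frozenset({"P1", "P3"}): 50.0,
    frozenset({"P2", "P4"}): 95.0,
    frozenset({"P3", "P4"}): 120.0,
}
# Hypothetical PCA results: pairs giving a signal with each linker.
detected_standard = {frozenset({"P1", "P2"}), frozenset({"P1", "P3"})}
detected_extended = detected_standard | {
    frozenset({"P2", "P4"}),
    frozenset({"P3", "P4"}),
}

standard_median, extended_median = median_distance_by_linker(
    distances, detected_standard, detected_extended
)
print(standard_median, extended_median)
```

If longer linkers do reach more distant partners, the second median should be larger than
the first, as it is for these made-up values.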
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside the linkers of
additional lengths; this would provide validation and a point of comparison, and would help
detect problems that arose during construction of the proteins. For example, faulty
recombination or the appearance of mutations becomes easier to spot: interactions would be
observed for the correctly constructed protein, whereas the problematic one would show
none. Admittedly, adding this control makes the experiments and analyses more complex.
Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method
that remains relatively simple. It requires no special infrastructure, yet it can also be
applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo
method that retains the endogenous promoter for protein expression. The fragments do not
tend to associate spontaneously unless they are in very close proximity, which reduces
false positives. The DHFR PCA can be performed on either solid or liquid medium. It is
therefore easy to study PPIs under multiple growth conditions or in the presence of
cellular perturbations. It can also be monitored in real time, which gives access to the
study of interaction dynamics (56). These features offer certain advantages over the other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths that would provide several resolutions of the PPI network. The
first step would be to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. The optimal increment would
also need to be determined, so as to maximize the new information gained while taking into
account the additional complexity introduced by each added linker. The availability of
robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins
with different linker lengths. Such additional collections would provide a picture of the
yeast protein-protein association network at different resolutions, from fine to coarse.
Indeed, the longer the linker, the more distant the associations detected, which lowers the
molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into account. For small
protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient,
whereas the resolution should be coarser for large protein complexes.
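To illustrate how a range of linker lengths could translate into several resolutions, the
following sketch lists which protein pairs would be within reach for each linker
combination. Everything numeric here is an assumption made for illustration: the linker
lengths in residues, the inter-protein distances, and the rough figure of about 3.5
angstroms per residue for a fully extended peptide, which gives an upper bound rather than
a realistic reach for a flexible linker. The function names are likewise hypothetical.

```python
def max_reach(linker_a, linker_b, angstrom_per_residue=3.5):
    """Upper bound (angstroms) on the distance spanned by two linkers,
    assuming both are fully extended (a deliberately crude assumption)."""
    return (linker_a + linker_b) * angstrom_per_residue


def pairs_within_reach(distances, linker_a, linker_b):
    """Pairs whose separation does not exceed the combined linker reach."""
    reach = max_reach(linker_a, linker_b)
    return sorted(pair for pair, d in distances.items() if d <= reach)


# Hypothetical separations (angstroms) between fusion points of protein pairs.
distances = {("A", "B"): 40.0, ("A", "C"): 90.0, ("B", "C"): 150.0}

# Hypothetical linker combinations, in residues (first fragment, second fragment).
for combo in [(10, 10), (20, 10), (20, 20)]:
    print(combo, pairs_within_reach(distances, *combo))
```

Each added linker length widens the set of pairs that can be reached, at the cost of a
coarser view of the complex, which is the trade-off between resolution and coverage
discussed above.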
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give
us a general picture of their architecture. At the level of an organism's proteome, this
method would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Foreword
This thesis consists of a single chapter written in the form of a scientific article that
will be submitted for publication. The article presents the adaptation of the PCA method to
detect associations between proteins that are distant in space, and its application to the
study of protein complexes. I contributed to the planning of the experiments with Christian
R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research
professionals). Several people, myself included, took part in carrying out these
experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre
K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were
performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis
of the results and the writing of the article were carried out jointly by Isabelle
Gagnon-Arsenault, Christian Landry and myself.
During this project, I also contributed to the writing of a literature review published in
Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations
of protein interactomes reveal their mechanisms of regulation, robustness and insights into
genotype-phenotype maps". Several people took part in the writing: Marie Filteau
(postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral
student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and
Christian R. Landry. That article is not presented in this thesis.
General introduction
1.1 The fundamental nature of protein-protein interactions
Proteins, with their great diversity of roles, are considered the machinery of life. Their
temporary or permanent associations lie at the heart of signalling and regulatory pathways
as well as protein complexes. Proteins can interact with one another through intermolecular
forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic
interactions. Protein-protein interactions (PPIs) are essential for the proper functioning
of the cell, since they are involved in all cellular processes as well as in the
maintenance of cellular functions.
Interactions that form transiently are often found in signalling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination
leads to diseases such as cancer (1). One example of a transient association is that of the
two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The
activity of this enzyme is regulated by the association and dissociation of the catalytic
and regulatory subunits. In yeast and mammals, the transition from one form to the other
controls several processes, including energy metabolism, cell growth, aging and the
response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases
such as Cushing's syndrome (8).
In addition to transient interactions, the cell hosts stable interactions between proteins,
which lead to the formation of protein complexes. Although the PPIs within a complex are
stable, the complex itself may form only in a particular context. A protein complex can be
defined as an association between two or more proteins (9). The association between these
proteins allows additional biological activities to emerge that would be impossible for the
individual proteins. An example that illustrates this concept very well is the proteasome,
a protein complex involved in protein homeostasis through the degradation of obsolete
proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists
of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It
comprises 33 proteins, some of which are present in more than one copy (10-13). Given its
importance in protein recycling, the proteasome is an attractive target for fighting cancer
and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning and cellular viability of a given organism. The PPI network has
been mapped at large scale for several organisms, notably humans (17), Saccharomyces
cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several
bacteria (23-26) and several viruses (27-29). These maps are a static picture of the
network and do not fully account for the cell's capacity to adapt to different conditions
(e.g., environment, cell cycle). To overcome this limitation, additional maps were
subsequently produced that consider the dynamics of interaction networks, by perturbing
cell growth conditions. Among other things, they provide information on the adaptation, or
plasticity, of an organism in the presence of a stress or a new environment. Despite this
new perspective, it remains difficult to distinguish a stable interaction from a transient
one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs sheds new light on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the
study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which
diverged more than 1.5 billion years ago, show similarities and differences in the
structure of their nuclear pore. This essential protein complex forms a channel in the
membrane of the cell nucleus and controls the transport of molecules between the nucleus
and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear
pore and the part that subsequently diverged. The differences in structure explain the
distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs
helps elucidate the robustness of a protein complex to mutations, that is, the ability of
the complex to function despite the perturbation. Diss and colleagues systematically
deleted the genes encoding the proteins found in the nuclear pore and the retromer (31).
The retromer is a non-essential protein complex whose function is to recycle membrane
receptors. By analysing the interactions present in these complexes after each
perturbation, the authors observed that the nuclear pore remained functional despite the
loss of certain proteins, whereas the retromer dissociated completely after the loss of a
single protein. They thereby identified the proteins essential for the assembly of these
complexes and demonstrated the importance of paralogs for robustness (31).
In medicine, the study of PPIs has been widely used to discover new drugs (32-34).
Moreover, identifying structural differences in a protein complex between two organisms can
provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should
eventually make it possible to treat the infections caused by these parasites (35). PPIs
also help to understand the genetic basis of diseases, as shown by Sahni and colleagues.
This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In
nearly 60% of cases, perturbation of the interaction networks was responsible for the
diseases studied, affecting the networks either partially or completely. Furthermore,
different mutations in the same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed
to study them. These methods are complementary, since each has advantages and limitations
that allow it to target only a different subset of the interaction network (37).
Nevertheless, all of these methods can be divided into two main categories: methods that
determine the composition of protein complexes, and methods that determine the physical
interactions between two proteins.
The first category includes methods that purify a protein complex, either by affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second
category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane
yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods
in the second category share a very similar principle: the reconstitution of a functional
reporter that emits a signal when the two proteins physically interact. The second
category also includes three hybrid methods: energy transfer between fluorescent molecules
(FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In
this context, "hybrid method" refers to methods that detect associations between proteins
that are close in space without these associations necessarily being physical
interactions. These methods therefore combine the characteristics of both categories. For
the purposes of this project, they are considered part of the second category because they
provide information on the spatial relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships between them.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS
aims to isolate a protein complex and identify its members. Several techniques are used to
purify protein complexes, including affinity chromatography. Affinity chromatography
separates a protein of interest and its interactors from a protein extract by means of an
epitope specific to that protein. This epitope is recognized by an antibody bound to the
purification column. Several rounds of purification can be performed to reduce the
non-specific interactions that cause background noise. The isolated proteins are then
digested into peptides. The mass spectrometer ionizes these peptides and separates them
according to their mass-to-charge ratio, producing a mass spectrum. Comparing the
resulting profiles with those in a database identifies the proteins present in the complex
(38-40). Tandem mass spectrometry (MS/MS) is also possible. From a first MS, a peptide is
selected and fragmented, and a new spectrum is acquired from the resulting fragments. This
additional spectrum provides more information about the peptide (41, 42). Other
purification techniques exist, such as size-exclusion chromatography, in which separation
is based on the size of the protein complexes. The main advantage of this purification is
that it can isolate all the protein complexes of an organism for study (43).
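To make the mass-to-charge ratio mentioned above concrete, the following minimal sketch computes the monoisotopic mass of a peptide and its m/z for a given charge state. It assumes protonated ions and standard monoisotopic residue masses; the function names and the example peptide are illustrative and do not come from the thesis.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of the proton carried by each positive charge


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide (one-letter codes)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge):
    """m/z of the peptide carrying `charge` protons (assumption: [M+zH]z+ ions)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # Illustrative peptide only: one repeat of the GGGGS linker motif.
    for z in (1, 2):
        print(z, round(mass_to_charge("GGGGS", z), 4))
```

The same peptide appears at a different m/z for each charge state, which is one reason search engines compare whole spectra against a database rather than single values.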
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest through a linker. When the two proteins of
interest physically interact, the two reporter fragments assemble, reconstituting a
functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor
was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method
that enabled the development of several others. However, the technique has some
limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of
the method have nevertheless been introduced to allow the study of membrane proteins
(45-47); this approach is the subject of the next paragraph. Second, because the reporter
is a transcription factor, the interactions tested must take place in the nucleus, which
may alter the endogenous localization of the proteins. The technique also has low
sensitivity, shows background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is therefore
impossible to relate the abundance of a protein to the strength or abundance of an
interaction between proteins (48-50). Despite these constraints, it is still widely used
because it allows the PPIs of another species, such as human, to be studied in a simpler
model (51).
In MYTH, the two reporter fragments derive from a mutated ubiquitin to which a
transcription factor is attached. When the proteins of interest physically interact, the
transcription factor attached to the reconstituted ubiquitin is released, activating
transcription of a reporter gene. Split-ubiquitin methods have enabled major advances in
the study of membrane proteins, insoluble proteins and proteins located outside the
nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background
noise and the impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but instead of a transcription factor
it uses as a reporter a protein that has been split into two fragments. The choice of
reporter and of cleavage site were decisive elements in the design of the method.
Moreover, because the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to interact spontaneously unless they are
very close to each other, which reduces background noise (54). In yeast, PCA uses as a
reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers
resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and
is involved in particular in the synthesis of certain DNA bases (purines and thymine). In
yeast, the observed signal is cell density, that is, the number of cells that have managed
to grow on the selection medium. The technique has the advantage of being quantitative
while keeping the proteins under their native promoter (48, 55, 56). Furthermore, PCA
results suggest that the cellular localization of proteins is preserved: there is a gene
ontology enrichment for several known proteins sharing the same cellular localization
(55). However, a change in localization cannot be ruled out, because the reporter
fragments are added at the C-terminus, which could interfere with the localization signal
sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
Adding a linker often reduces these risks by moving the reporter fragment away from the
protein to which it is attached, which reduces interference between the two. It may be
necessary to optimize the composition or length of the linker. There are three categories
of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible
linkers are generally used when some mobility between the protein of interest and the
reporter fragment is desirable. Rigid linkers provide better separation between the
protein of interest and the reporter fragment and ensure that the functions of each
element are maintained. They are mostly useful when a flexible linker is insufficient to
separate the two elements or interferes with the activity of the protein. In vivo
cleavable linkers release the reporter fragment under certain conditions. They are
particularly useful for allowing each element to carry out its own biological activity. It
is therefore essential to choose the linker and its parameters carefully to obtain the
expected results (58, 59).
1.3.2.2 Hybrid methods
Although placed in the second category of methods, FRET, cross-linking followed by MS and
BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The
two fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close to each other. The interaction is
detected by microscopy or cytometry through the emission of the acceptor fluorescent
protein. This method is particularly useful for following an interaction over time.
However, substantial background noise and partial overlap between the fluorescence of the
two proteins can hamper interpretation of the results (60-63).
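The distance dependence that makes FRET a proximity reporter can be illustrated with the standard Förster relation, E = 1 / (1 + (r / R0)^6). The thesis does not state this formula or any Förster radius; the sketch below is a generic illustration and the R0 value of 50 Å in the usage example is a hypothetical placeholder.

```python
def fret_efficiency(distance, forster_radius):
    """Transfer efficiency for a donor-acceptor pair (standard Förster relation).

    `distance` and `forster_radius` must be in the same unit (for example Å).
    The Förster radius R0 is the distance at which efficiency equals 0.5.
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


if __name__ == "__main__":
    r0 = 50.0  # hypothetical Förster radius in Å, for illustration only
    for r in (25.0, 50.0, 100.0):
        print(r, round(fret_efficiency(r, r0), 4))
```

Because efficiency falls with the sixth power of distance, the signal drops sharply once the two fluorescent proteins are farther apart than R0, which is why FRET reports proximity rather than mere co-complex membership.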
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that before purification the proteins are attached to one another by
covalent bonds. These bonds withstand enzymatic digestion and thus provide structural
information on how proteins associate within the protein complex. Nevertheless,
cross-linking complicates data analysis and can lead to an incorrect picture of the
architecture of the protein complex. The method is difficult to apply to the global study
of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity,
fused to the protein of interest. Interactors carrying a biotin group on their accessible
lysines are selectively isolated and identified by MS. BioID can detect weak and transient
interactions in addition to interactions between neighboring proteins. However, the biotin
ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used
in molecular biology. This large size can interfere with the activity of the protein of
interest or with the formation of interactions. In addition, the method is not
quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly attractive because they give a more
global view of the PPI network. They report on the proximity of proteins, giving access to
a scale of molecular resolution that is otherwise difficult to reach. Besides being
complex, the existing techniques require specialized infrastructure (equipment and
databases) and are difficult to apply on a large scale. Developing simpler,
higher-throughput hybrid methods would help better define the architecture of protein
complexes and their subcomplexes at low molecular resolution. They would complement the
two categories of methods. These new hybrid methods would compensate for the shortcomings
of high-resolution methods such as crystallography or nuclear magnetic resonance, which
determine the precise structure of proteins or protein complexes. Indeed, those
high-resolution methods are difficult to apply to many protein complexes and require an
approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and the linker that connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment composed of two repeats of the
following motif: four glycines and one serine (GGGGS). It provides good flexibility and
good association of the reporter fragments in the cellular environment. Indeed, glycine
and serine are two small amino acids, the former nonpolar and the latter polar and
uncharged. The linker connects the reporter fragment to the C-terminus of the proteins
under study.
The length of the linker also constrains the ability to detect an interaction, as was
observed by the research team that developed large-scale PCA (55). By studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noticed that an
interaction was 3.5 times more likely to be detected when the C-termini of the proteins of
interest were less than 82 Å apart (55). This distance corresponds to the length of the
two linkers placed end to end. In addition, an earlier study had shown that increasing the
length of the linker made it possible to determine the conformation of a dimeric receptor
(69). It is thus possible to detect new interactions and thereby obtain new structural
information.
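As a rough illustration of how linker length could translate into reach, the sketch below takes the two figures given in the text (two 10-residue linkers spanning 82 Å end to end) and extrapolates linearly to longer linkers. The linear scaling and the resulting per-residue value are assumptions made here for illustration only: flexible linkers are not rigid rods, and the text describes a higher detection probability below 82 Å rather than a hard cutoff.

```python
MOTIF = "GGGGS"                      # linker motif given in the text
REFERENCE_REPEATS = 2                # the standard linker has two repeats
REFERENCE_SPAN_ANGSTROM = 82.0       # two standard linkers end to end (from the text)

# Assumption: span scales linearly with residue count (82 Å / 20 residues).
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / (2 * REFERENCE_REPEATS * len(MOTIF))


def linker_residues(repeats):
    """Number of residues in a linker made of `repeats` GGGGS motifs."""
    return repeats * len(MOTIF)


def max_span(repeats_a, repeats_b):
    """Extrapolated end-to-end span (Å) of the two linkers of a bait-prey pair."""
    total = linker_residues(repeats_a) + linker_residues(repeats_b)
    return total * ANGSTROM_PER_RESIDUE


def within_reach(distance, repeats_a, repeats_b):
    """True if a C-terminus to C-terminus distance (Å) is below the extrapolated span."""
    return distance <= max_span(repeats_a, repeats_b)


if __name__ == "__main__":
    for pair in ((2, 2), (3, 2), (4, 2), (4, 4)):
        print(pair, round(max_span(*pair), 1))
```

Under these assumptions, a pair of four-repeat linkers would span about twice the 82 Å of the standard pair; the experiments in the thesis are what test whether detection actually follows this trend.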
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the linker length of the DHFR PCA would make
it possible to detect interactions increasingly distant in space, thereby modulating the
scale of molecular resolution. This adaptation would then yield a new hybrid method that
could help define protein-protein associations between protein complexes and subcomplexes.
The first objective was to assess the general impact of different linker lengths on the
ability to detect protein-protein associations. To this end, protein-protein associations
of 15 proteins found in seven protein complexes were tested against the proteins found in
these complexes and their known interactors. The second objective was to assess the impact
of increasing linker length on our understanding of the architecture of protein complexes
and their subcomplexes. Five protein complexes differing in size and flexibility were
studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG)
complex. The study was carried out with different combinations of linker lengths. The
final objective was to determine whether increasing linker length allowed the detection of
associations between proteins farther apart in space. To do so, distances were calculated
between the proteins contained in the proteasome structures and compared with the
experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of
protein complexes and on PPIs. Moreover, this organism has played a key role in advancing
knowledge in various fields, such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks and the study of human diseases
(70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system functions requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangements. We examined
the potential of the protein-fragment complementation assay based on dihydrofolate
reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein
complexes. We showed that using extended peptide linkers between the fusion proteins and
the DHFR fragments improves the detection of protein-protein interactions and reveals
interactions that are more distant in space. Extended linkers thus provide an improved
tool for detecting and measuring protein-protein interactions and protein proximity in
vivo. We used this tool to further investigate the architecture of the RNA polymerases,
the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open
new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we
examine the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are further apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to
mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the
transcriptional, translational and posttranslational levels contributes to the diversity
of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex memberships or detect physical
association (81). The first category includes methods based on protein purification
followed by mass spectrometry. In this case, protein assignment to a specific complex
depends on stable associations among proteins that survive cell lysis and fractionation or
affinity purification (82, 83). The majority of PPIs that populate interactome databases
derive from such methods because a single purification leads to the inference of many
interactions among the co-purified proteins. Unfortunately, very little is known about the
structural and context dependencies of PPIs inferred from co-complex membership because
detecting an association does not provide information on the spatial organization of the
complex (84-86). The second category of methods reports binary or pairwise interactions
between proteins and reveals direct or nearly direct interactions. Such methods include
the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays
(PCAs) (87) and technologies based on similar principles (52). The two categories are
potentially complementary because, on the one hand, they tell us which proteins assemble
into complexes in the cell and, on the other hand, how proteins may be physically located
relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future developments in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing
methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a
functional protein that acts as a reporter for the association of the two proteins (55,
89-94). Proteins are usually connected to the reporter fragments with a linker of ten
amino acids. In principle, the length of the linker limits the maximum distance between
the proteins for an interaction to be detectable. The first large-scale study performed
using DHFR PCA in yeast showed that the distance constraint determined by linker length
could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex
and several other protein complexes for which the distance between the C-termini of
proteins could be measured, protein interactions were 3.5 times more likely to be detected
if the C-termini were within 82 Å of each other. In addition, an earlier study in
mammalian cells showed that increasing the linker length of the PCA reporter allows the
detection of configuration changes in a dimeric membrane receptor (69). Together, these
results suggest that linkers of variable sizes could improve the detection of PPIs and
even be used as a ruler to infer, albeit roughly, distances between proteins in living
cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in
living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid
medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B
(HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were
grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium
sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and
200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to a linker of varying size. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two
for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting
in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
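The idea of extending the linker with synonymous codons can be sketched as follows. The codon choices below are hypothetical and are not the sequences used in the plasmids; they only show that different DNA repeats can encode the same Gly-Gly-Gly-Gly-Ser peptide. Avoiding identical DNA repeats is a plausible motivation, although the text does not state one.

```python
# Subset of the standard genetic code: all glycine and serine codons.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}

# Hypothetical codon choice for each successive GGGGS repeat (illustration only).
REPEAT_CODONS = [
    ["GGT", "GGC", "GGA", "GGG", "TCT"],
    ["GGC", "GGA", "GGG", "GGT", "AGC"],
    ["GGA", "GGG", "GGT", "GGC", "TCC"],
    ["GGG", "GGT", "GGC", "GGA", "AGT"],
]


def build_linker_dna(repeats):
    """Concatenate the first `repeats` codon sets into one coding sequence."""
    if not 1 <= repeats <= len(REPEAT_CODONS):
        raise ValueError("repeats must be between 1 and %d" % len(REPEAT_CODONS))
    return "".join("".join(codons) for codons in REPEAT_CODONS[:repeats])


def translate(dna):
    """Translate a Gly/Ser-only coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    for n in (2, 3, 4):
        dna = build_linker_dna(n)
        print(n, dna, translate(dna))
```

Each repeat differs at the DNA level while the translated peptide is always a whole number of GGGGS motifs.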
To replace the 2xL in pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR
F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into plasmid pUC57
containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were
then amplified by PCR, digested with DpnI and purified. Plasmid pAG25-linker-DHFR
F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid
without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were
assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions
were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive
clones were confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from
pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR
F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region
was gel-extracted. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR
F[3]):vector ratio of 4:4:1.
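The insert:vector ratios quoted above are molar ratios, so the mass of each fragment depends on its length relative to the vector. The sketch below shows that conversion, assuming double-stranded DNA of uniform average mass per base pair; the fragment lengths and vector amount in the usage example are hypothetical and are not given in the text.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    Assumes double-stranded DNA, so mass is proportional to length in base pairs.
    """
    if min(vector_ng, vector_bp, insert_bp, molar_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    vector_ng, vector_bp = 100.0, 5000
    # Two-fragment assembly at 5:1 (insert:vector).
    print(insert_mass_ng(vector_ng, vector_bp, insert_bp=500, molar_ratio=5))
    # Three-fragment assembly at 4:4:1 (linker:DHFR F[3]:vector).
    for name, length in (("linker", 100), ("DHFR F[3]", 400)):
        print(name, insert_mass_ng(vector_ng, vector_bp, length, molar_ratio=4))
```

Short inserts need far less mass than the vector to reach the same molar excess, which is why the ratio is computed in moles rather than nanograms.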
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions,
respectively (Table S1A). All fusions were made at the 3′ end of genes. The
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively),
were amplified by PCR from their respective plasmids with oligonucleotides specific to the
gene to be fused with the DHFR fragments (PCR primer sequences are listed in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusion in all strains.
Estimation of protein abundance
Protein quantification was performed by Western blot for several strains with proteins
fused to the 2xL and 4xL. These proteins were selected because their abundance could
readily be assessed using antibodies raised against them. 20 OD600 units of exponentially
growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM
PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of
425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment
(Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and
supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of
cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p)
SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry
device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or
mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2
hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes
were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000),
IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10,000)
in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20
were performed and signal on the membranes was detected using an Odyssey® Fc Imaging
System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins, members of seven
complexes, fused to 2x/3x/4xL-DHFR F[12]. Prey proteins fused to 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or because
they interact with one of them according to data reported in BioGRID in October 2014 (96).
A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus
was also included among the preys as controls. Each prey was present in four replicates,
two on each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the
consensus protein complexes published by (84) to choose 95 central and associated protein
members of the following complexes: RNApol I, II and III, the proteasome and the COG
complex. These complexes were selected because they vary in size (RNApol I (n=14),
II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and because interactions among their members have been
shown to be detectable, at least partially, by DHFR PCA. In addition, published structures
are available for the RNApol and proteasome complexes, making it possible to compare our
results with known protein complex organization. We successfully constructed 80.0% and
76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome,
respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused
to 2xL/4xL-F[12] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins
(85 out of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one
mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates
of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, within a block of four strains, all combinations
of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol
and COG complexes and in 16 replicates for the proteasome complex. The blocks were
randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain
a double border of a strain showing a weak interaction (Pop2-2xL-F[12]-Arc35-2xL-F[3]) to
avoid border effects on colony growth.
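The block design can be illustrated with a short Python sketch. It is not the arraying script used for the experiment; the function name `linker_block` and the strain labels are hypothetical and only show how one bait-prey pair expands into the four linker combinations listed above.

```python
from itertools import product

# Linker lengths used in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return the four (bait strain, prey strain) combinations of one block.

    The first linker of each combination is carried by the bait fusion and
    the second by the prey fusion. Labels are illustrative only.
    """
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]


block = linker_block("Rpb2", "Rpb3")
for bait_strain, prey_strain in block:
    print(bait_strain, "x", prey_strain)
```

Keeping the four combinations adjacent on the array is what later allows the interaction scores of a block to be compared directly.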
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying
tool. Colonies were further condensed into 384-format arrays and finally into 1536-format
arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different
1536-format prey plates were generated and replicated a few times to have enough cells to
perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed
with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days
at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an
incubation time of two days at 30°C per round. Finally, diploid strains were replicated on
MTX medium and incubated at 30°C for four days, after which a second round of MTX
selection was performed. Plates were incubated at 30°C for another four days. Images were
taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid
selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution
assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment
fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial
dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA
media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS
Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using
the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted, the pixel
intensity matrix was smoothed using a two-way median polish, and the result was averaged
with the raw image. We then converted the images to binary files and applied a manual
threshold across plates. We selected colonies for measurement with a circular selection
using particle detection with the built-in function "Analyze particle" in ImageJ64. We
excluded particles touching the edge of the selection and those that had an area below 20
pixels and a circularity below 0.5, using the particle closest to the center. We
considered the particle to be a colony if its center of mass was within the mid-distance
between two colonies. All plate images were also examined. The average of the background
pixels was subtracted from the colony intensity.
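For readers unfamiliar with the two-way median polish, the following Python sketch shows the idea on a toy intensity matrix. It is not the ImageJ script used in this work. It assumes that the "smoothed" matrix corresponds to the additive fit (overall + row effect + column effect) and that averaging with the raw image is a plain pixel-wise mean; the function names are hypothetical.

```python
from statistics import median


def median_polish(matrix, n_iter=10):
    """Tukey's two-way median polish.

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the row medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        # Move the median of the column effects into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the column medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Move the median of the row effects into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid


def smooth_and_average(matrix):
    """Average the raw matrix with its additive median-polish fit (assumed reading)."""
    overall, row_eff, col_eff, _ = median_polish(matrix)
    return [
        [0.5 * (matrix[i][j] + overall + row_eff[i] + col_eff[j])
         for j in range(len(matrix[0]))]
        for i in range(len(matrix))
    ]


# Toy 3x3 intensity matrix: additive background with one aberrant pixel (90).
pixels = [[10, 12, 11], [20, 22, 21], [30, 32, 90]]
overall, rows, cols, resid = median_polish(pixels)
corrected = smooth_and_average(pixels)
print(resid[2][2], corrected[2][2])
```

Because medians are robust, the aberrant pixel ends up almost entirely in the residuals, so the fitted surface is barely affected by it.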
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a
size smaller than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were kept and the median colony size was used as the interaction score (Is).
For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R
package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were
considered significantly detected interactions. These Zs were used to compare the same
interaction across different linker size combinations. We considered changes significant
when Zs differed by more than 2.
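The conversion of interaction scores into z-scores can be sketched in Python as follows. The analysis itself was done with the R package mixtools; the hand-written EM below is only a stand-in to make the steps explicit, and the simulated scores, the initialisation and the function names are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, n_iter=200):
    """EM fit of a two-component 1-D normal mixture.

    Returns two (weight, mean, sd) tuples sorted by mean; the component with
    the lower mean is treated as the background distribution.
    """
    xs = sorted(values)
    half = len(xs) // 2
    # Crude initialisation from the lower and upper halves of the data.
    mus = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sds = [max((xs[-1] - xs[0]) / 4.0, 1e-6)] * 2
    ws = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to component 0.
        resp = []
        for x in xs:
            p0 = ws[0] * normal_pdf(x, mus[0], sds[0])
            p1 = ws[1] * normal_pdf(x, mus[1], sds[1])
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        for k, r_k in enumerate((resp, [1.0 - r for r in resp])):
            n_k = sum(r_k)
            if n_k < 1e-9:
                continue
            ws[k] = n_k / len(xs)
            mus[k] = sum(r * x for r, x in zip(r_k, xs)) / n_k
            var = sum(r * (x - mus[k]) ** 2 for r, x in zip(r_k, xs)) / n_k
            sds[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(ws, mus, sds), key=lambda c: c[1])


def z_scores(scores, b, sdb):
    """Zs = (Is - b) / sdb for each interaction score Is."""
    return [(s - b) / sdb for s in scores]


# Simulated interaction scores: mostly background plus a few true interactions.
random.seed(0)
scores = [random.gauss(8.0, 0.5) for _ in range(900)]
scores += [random.gauss(13.0, 1.0) for _ in range(100)]

(_, b, sdb), _signal = fit_two_normals(scores)
zs = z_scores(scores, b, sdb)
n_detected = sum(z > 2.5 for z in zs)  # cut-off on Zs, value assumed here
print(round(b, 2), round(sdb, 2), n_detected)
```

Using the background component rather than the whole distribution keeps the z-score scale from being inflated by the true interactions themselves.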
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e.
values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3
represent the first and third quartiles). Colonies corresponding to the control
interaction and positioned on the array edges were removed from downstream analyses, as
well as strains for which sequencing results revealed mutations in the DHFR fusion
proteins. After these final filtering steps, interactions with at least four replicates
for every linker combination were kept and the median colony size was used as the Is.
Significant interactions were identified as described above (Fig S1B). For the RNApol and
the proteasome, the estimated mean (b) and standard deviation (sdb) of the background
distribution were calculated for each linker combination and each complex separately. For
the COG complex, because the number of pairwise interactions is limited to 64, all the
results were combined to calculate these parameters. An interaction was considered
detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected
interactions with at least one linker combination, some pairs were filtered out, mainly
because they did not pass all of the thresholds or because the fusion strains (Taf14 and
Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a
total of 228 (197 unique) pairs of interacting proteins.
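The outlier rule and the replicate filter can be written compactly. The sketch below is illustrative: the colony sizes are invented, the function names are hypothetical, and the quartile definition (Python's "inclusive" method, which matches R's default) is an assumption, since the quartile method is not specified in the text.

```python
from statistics import median, quantiles


def drop_extreme_outliers(sizes, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in sizes if low <= s <= high]


def interaction_score(replicate_sizes, min_replicates=4):
    """Median colony size, or None when too few replicates remain."""
    if len(replicate_sizes) < min_replicates:
        return None
    return median(replicate_sizes)


# Invented log2 colony sizes with one aberrant colony (25.0).
sizes = [10.1, 10.4, 9.8, 10.0, 25.0, 10.2]
kept = drop_extreme_outliers(sizes)
score = interaction_score(kept)
print(kept, score)
```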
At this step, pairs of interacting proteins presenting a new interaction (i.e. the
interaction was not detected with the reference linker size (2xL-2xL) but was detected
with a longer linker combination) were separated from the others and classified as new
interactions (Table S1C). For the remaining pairs, because baits and preys were positioned
so that all combinations of linker lengths could be tested for a specific interaction
within a block of four adjacent strains (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is
for the different linker size combinations could be compared directly. The difference from
the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL,
4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in
colony size (with FDR-corrected p-values). These pairs of interacting proteins were
separated into two additional categories: unchanged interactions, where the interaction
was detected with the reference linker size (2xL-2xL) and also with the longer linker
combinations but without any significant change (t-test FDR p-value above 0.05); and
quantitative changes, where the interaction was detected with the reference linker size
(2xL-2xL) and presented a significant change for at least one longer linker combination
(difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05)
(Table S1C).
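The classification logic can be summarised in a few lines of Python. This is a simplified sketch, not the analysis script: the FDR procedure is assumed to be Benjamini-Hochberg (the text only says "FDR corrected"), p-values are taken as given rather than computed from a paired t-test, and pairs with a significant but small difference are treated here as unchanged.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_long, diff, fdr_p, min_diff=1.0, alpha=0.05):
    """Assign a pair to one of the categories described in the text.

    detected_ref: detected with the 2xL-2xL reference combination.
    detected_long: detected with at least one longer linker combination.
    diff: difference in Is relative to the 2xL-2xL reference.
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    if abs(diff) > min_diff and fdr_p < alpha:
        return "quantitative change"
    return "unchanged"


raw_p = [0.01, 0.04, 0.03, 0.20]  # invented p-values
adj_p = benjamini_hochberg(raw_p)
print(adj_p)
print(classify(False, True, 2.0, 0.001))
print(classify(True, True, 1.5, 0.01))
print(classify(True, True, 0.3, 0.5))
```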
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the
base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome
complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of
the experimental protein sequences of the individual sections of the proteasome complex
with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4.
Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is
composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an
incorrect overlap in the central core (Fig S2B). This region is well resolved in 5CZ4.
Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core, while the inner rings of the core were taken from 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to
calculate the distance (a list is provided in Table S2D). The distances were calculated
from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX
(an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα
atoms of surface residues were used as nodes to build the graph. Edges were placed between
each pair of nodes within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the
proteasome. The weight of each edge was equal to the distance between the two nodes.
Surface residues were identified as follows. First, the structure of the protein complex
was represented using the "show dots" and "set dots_solvent" commands in PyMOL, with a
solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These
dots were exported in the "wrl" graphic file format, and the coordinates of each dot were
extracted from this file. Residues within 15 Å of any dot of the RNApol II structure and
within 20 Å of any dot of the proteasome structure were considered surface residues (see
Fig S2D for a representation of the method for the proteasome). In cases where multiple
copies of a protein were present within a complex, the mean of the minimal possible
distances was used for the analyses.
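The surface-path distance can be illustrated with a minimal Python sketch. The analysis relied on NetworkX; the version below re-implements Dijkstra's algorithm with the standard library only so that it is self-contained. The coordinates, node labels and function names are invented for illustration.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`, weighted by distance.

    coords: {node_id: (x, y, z)} for the C-alpha atoms of surface residues.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in graph[node]:
            nd = d + w
            if nd < best.get(nxt, math.inf):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return math.inf


# Toy surface: A and C are 20 A apart, so with a 15 A cutoff the path must
# go through B, which sits off the straight line.
coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (10.0, 10.0, 0.0),
    "C": (20.0, 0.0, 0.0),
    "far": (500.0, 0.0, 0.0),
}
graph = build_graph(coords, cutoff=15.0)
path_ac = shortest_path_length(graph, "A", "C")
print(round(path_ac, 2), math.dist(coords["A"], coords["C"]))
```

Restricting nodes to surface residues makes the path follow the outside of the complex, which is a closer proxy for the reach of a flexible linker than the straight-line distance through the protein mass.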
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[12] and DHFR F[3]) to be
introduced in yeast (see Table S1A for the strains used in this study). We assessed
whether longer linkers destabilize proteins and therefore interfere with the detection of
PPIs. No evidence of protein degradation was found for any of the six proteins examined
using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker
length affects protein stability, the effect is minor and not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, each fused
to the DHFR F[12] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
within the same complexes as the baits and random proteins used as controls, for a total
of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and
126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top
left panel), revealing a significant increase in signal-to-noise ratio with longer
linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold
z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven
increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions
may represent steric effects that reduce signal due to the fusion of the DHFR fragments.
Four out of nine increased interactions were reported by affinity-capture mass
spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may
allow for the detection of PPIs that are not necessarily direct. Moreover, the four
interactions with the highest PCA signal represent cases between baits and preys within
the same complexes, suggesting that there is no decrease in specificity with the elongated
linkers. Finally, for the cases where proteins were not in the same complex or were not
previously shown to interact, it is likely that they represent actual interactions
previously undetected in living cells. For example, many genetic interactions and physical
interactions (in vitro and in vivo) have been described between the actin cytoskeleton and
the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
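The size of the screen follows directly from the design. The short Python check below uses only counts stated in the text (15 bait proteins, three linker lengths, 495 selected preys and 97 random controls); the variable names are arbitrary.

```python
n_bait_proteins = 15
linkers = ("2xL", "3xL", "4xL")
n_preys = 495 + 97  # selected preys plus random control preys

n_baits = n_bait_proteins * len(linkers)  # one bait strain per protein and linker
n_tested = n_baits * n_preys              # bait x prey combinations

print(n_baits, n_preys, n_tested)  # 45 592 26640
```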
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ
in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL)
for all proteins within a complex. As a negative control, tests for PPIs between the
RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] counted as a
single PPI).
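The distinction between directional pairs and unique pairs amounts to ignoring which protein carries which DHFR fragment. A minimal Python sketch, with invented example pairs and a hypothetical function name:

```python
def unique_pairs(detected):
    """Collapse reciprocal (bait, prey) pairs into unordered protein pairs."""
    return {frozenset(pair) for pair in detected}


# (F[12]-tagged protein, F[3]-tagged protein); the first two are reciprocal.
detected = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
pairs = unique_pairs(detected)
print(len(detected), "directional pairs ->", len(pairs), "unique pairs")
```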
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[12]-Y-DHFR F[3] versus Y-DHFR F[12]-X-DHFR F[3],
were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60%
of interacting pairs (135/228, or 114/197 unique), no significant change in interaction
strength was observed when using the 4xL compared to the 2xL, reinforcing the observation
that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs
(Fig 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs
and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the
linker length can substantially widen the repertoire of detected interactions for a
complex.
In general, having only one longer linker (mainly 4xL-DHFR F[12]) was sufficient to
detect new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to
PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the
two largest components of the RNApol II, contributes to five out of the nine
quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main
partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may
thus arise from steric effects rather than from destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the
RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the
RNApol complexes, more than half of the new interactions were found between proteins
common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to an
individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions
involved Nas6, an assembly chaperone for the proteasome, and proteins from the base
subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between
Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig 1D, right panel).
Together, these results show that doubling the linker length of central proteins in
complexes expands the network of interactions detected by DHFR PCA and helps to better
describe the organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly
well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are
detected when the two proteins are in the same subcomplex (such as base-base, core-core
and lid-lid) regardless of the linker length, though the fraction is systematically higher
with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D
and 1E, left and right panels). Structural biology in living cells could thus gain from
PPI data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for
distance, we measured the shortest path between C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the
cumulative distribution of z-scores as a function of pairwise distances, where positive
z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig 2B, right panel). The density distribution of distances within
complexes is also slightly shifted towards larger distances for longer linkers, showing
that longer distances are better detected with longer linkers (Fig S1D). Finally, we find
that the distance between proteins is significantly longer for cases where a longer linker
increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
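The Fig 2B legend reports Spearman correlations between pairwise distance and z-score. As an illustration of what this statistic measures, the following Python sketch computes it from ranks using the standard library only. The distances and z-scores are invented and the helper names are hypothetical; it does not reproduce the reported values.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: z-scores tend to decrease as C-termini get further apart.
distances = [20, 45, 60, 90, 120]
zscores = [9.1, 7.4, 7.9, 3.2, 1.0]
rho = spearman(distances, zscores)
print(round(rho, 3))
```

A rank-based statistic suits this comparison because the relationship between distance and PCA signal is expected to be monotonic rather than linear.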
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact
directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to
detect and measure protein proximity in living cells and among endogenously expressed
proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to
82 Å, can be used to detect interactions in these specific conditions with an increased
signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including
interactions among complexes and among subcomplexes within large complexes. Because a
single longer linker is generally sufficient to detect new interactions, the current
strains from the DHFR PCA collection could be used as preys, requiring only the
construction of baits with different linker sizes. PCA is therefore an addition to the
other methods available to obtain low-resolution structural information among subunits of
complexes, which include chemical cross-linking of protein complexes (100), FRET-based
analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68).
Despite major advances in these other technologies in recent years, PCA will remain the
simplest assay because it requires minimal infrastructure investment and can be adapted
for high-throughput screening, which is still difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems
Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an
NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for
feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise)
obtained in a large-scale screen using baits fused to the DHFR F[12] fragment with a 3xL
(left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are
highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data
filtering for the intra-complexes PCA experiment. Blue circles: RNApol I, II and III.
Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively
changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference
interaction). Solid shapes: new PPIs (not detected with the 2xL-2xL reference linker but
detected with a longer linker combination). (C) Proportions of quantitatively changed
interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal
interactions such as X-DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] as a single
PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is
proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI.
Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new
PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the
experiment. (E) Proportion of detected PPIs out of the total tested for each combination
of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also
been modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance
between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL:
r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions
that are not affected by longer linkers and those that increase in signal or that are
newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast
proteins Complex
Identity
()
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 971
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
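The identity values listed above can be read as the percentage of compared positions at which the sequence from the structure and the experimental sequence carry the same residue. The following is a minimal sketch of such a calculation, assuming the two sequences have already been aligned and that gaps are written as "-". The function name, the gap convention and the choice of denominator (positions where neither sequence has a gap) are illustrative assumptions and not necessarily the procedure used to build these tables.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over positions where neither aligned sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # positions where both sequences have a residue
    matches = 0   # positions where the residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

With this convention, missing residues in the structure (gaps) do not lower the identity, which is one reason a separate count of missing C-terminal residues is useful.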
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast proteins   Complex   Reference   Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions
with the 4xL-4xL combination. Correlation coefficients for the other combinations are
r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores
for the proteasome for all combinations of linker lengths according to the distance between
the interacting proteins. The red line represents the density of distances for all interactions.
The distribution for detected interactions is shifted to the left because proteins are closer
to each other when the interactions are detected. The 4xL-4xL distribution is also slightly
shifted to the right due to the ability of the 4xL to detect interactions further in space.
(E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment,
showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of
selected results for the intra-complex experiment. Examples for each category of changes are
shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard
PCA (left).
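The thresholding described in panel (B) can be sketched as follows. This is a minimal illustration, assuming that the z-score of each protein pair is computed against the mean and standard deviation of all colony sizes in the same experiment. The function name, the pair labels and the use of the population standard deviation are assumptions made for the example, and the actual analysis may use a different background distribution.

```python
from statistics import mean, pstdev
from typing import Dict


def positive_ppis(colony_sizes: Dict[str, float],
                  z_threshold: float = 2.5) -> Dict[str, float]:
    """Return the z-scores of pairs whose colony size z-score exceeds the threshold.

    Assumption: z = (size - mean) / standard deviation, computed over all
    colony sizes passed in.
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All colonies have the same size, so nothing stands out.
        return {}

    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_threshold}
```

Because most pairs in a large screen do not interact, the bulk of the distribution acts as the background, and only colonies well above it are called positive.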
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
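The distance-weighted shortest path of panels (C) and (D) can be illustrated with a small graph search. The sketch below assumes that surface residues are represented by 3D coordinates, that two residues are connected when they lie within a cutoff distance of each other, and that each edge is weighted by the Euclidean distance between its endpoints. Dijkstra's algorithm is used here as one standard way to solve this problem. The function name, the cutoff parameter and the residue labels are hypothetical, and the procedure actually used for the figure may differ.

```python
import heapq
import math


def surface_path_length(coords, start, end, cutoff):
    """Length of the shortest path from start to end over surface points.

    coords maps a residue label to an (x, y, z) tuple. Two residues are
    linked only if they are at most `cutoff` apart (assumed parameter), so
    the path follows the surface instead of cutting through the complex.
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    visited = set()

    while queue:
        travelled, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return travelled

        for other, position in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], position)
            if step > cutoff:
                continue  # too far apart to be neighbours on the surface
            candidate = travelled + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(queue, (candidate, other))

    return math.inf
```

A path of this kind is at least as long as the straight-line distance between the two C-termini, which is the relevant quantity when the linkers have to travel around the complex rather than through it.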
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close to each
other in space, without these associations necessarily being physical interactions. Such a
method would make it possible to examine and dissect the architecture of protein complexes
in greater depth. In practice, this meant modifying the length of the linkers of the DHFR PCA
in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results with the distances measured from the available protein structures
of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed better
identification of associations. Seven new associations were observed within protein complexes
and between different complexes, notably between the proteasome and the actin cytoskeleton.
The nature of the associations detected suggests that the specificity of the DHFR PCA is
preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
that are used. For example, it could be lower if only one combination were used, since some
protein-protein associations were detectable only with a specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers provides information
on the organization of protein complexes, particularly when central proteins are involved.
Finally, the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results, it is possible to confirm that increasing the linker length does allow the
detection of associations between proteins that are further apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the linkers of
other lengths, which would provide a validation and a point of comparison and would help
detect problems that may have occurred during the construction of the fusion proteins. For
example, it is easier to spot a problem of incorrect recombination or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control does, however, make the experiments
and the analyses more complex. Despite this drawback, this variation of the DHFR PCA
provides an additional hybrid method that remains relatively simple. It requires no
particular infrastructure, yet it can also be applied at large scale using a robotic
platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter
for protein expression. The fragments do not tend to associate spontaneously unless they are
brought very close together, which reduces false positives. The DHFR PCA can be performed
either on solid medium or in liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be followed in
real time, which gives access to the study of interaction dynamics (56). These features
provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. It would
first be necessary to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize the new information gained, taking into account
the additional complexity introduced by each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the associations detected, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into consideration. For small protein
complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas the
resolution should be coarser for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or with other imaging methods. The
size of these complexes greatly limits their study and makes determining their architecture
a challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress, and
they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution provided by the extended linker would give us a
general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy, Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
General introduction
1.1 The fundamental role of protein-protein interactions
Proteins, through their wide diversity of roles, are regarded as the machinery of life.
Their temporary or permanent associations are at the heart of signaling and regulatory
pathways as well as of protein complexes. Proteins can interact with one another through
intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals
forces, and ionic interactions. Protein-protein interactions (PPIs) are essential for the
proper functioning of the cell, since they are involved in all cellular processes and in
the maintenance of cellular functions.
Interactions that form transiently are often found in signaling and regulatory processes.
They require excellent spatiotemporal coordination, which explains why poor coordination
can lead to diseases such as cancer (1). One example of a transient association is that of
the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2).
The activity of this enzyme is regulated by the association and dissociation of the
catalytic and regulatory subunits. In yeast and mammals, the transition from one form to
the other controls several processes, including energy metabolism, cell growth, aging, and
the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases
such as Cushing's syndrome (8).
In addition to transient interactions, the cell is home to stable interactions between
proteins, which lead to the formation of protein complexes. Although the PPIs within a
complex are stable, the complex itself may form only in a particular context. A protein
complex can be defined as an association between two or more proteins (9). The association
between these proteins allows the emergence of additional biological activities that would
be impossible for the individual proteins alone. An example that illustrates this concept
very well is the proteasome, a protein complex involved in protein homeostasis through the
degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved
among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two
regulatory subcomplexes. It comprises 33 proteins, some present in more than one copy
(10-13). Given its importance in protein recycling, the proteasome is an attractive target
for fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly demonstrate the essential role of protein-protein
associations. Nevertheless, they represent only a tiny part of a much larger and more
elaborate interaction network. Mapping PPI networks is essential for understanding the
organization, functioning, and cellular viability of a given organism. The PPI network has
been mapped at large scale for several organisms, notably humans (17), Saccharomyces
cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several
bacteria (23-26), and several viruses (27-29). These maps give a static picture of the
network and do not fully account for the cell's ability to adapt to different conditions
(e.g., environment, cell cycle). To overcome this limitation, additional maps were
subsequently produced that take the dynamics of interaction networks into account by
perturbing cell growth conditions. Among other things, they shed light on the adaptation,
or plasticity, of an organism in the presence of a stress or a new environment. Despite
this new perspective, it remains difficult to distinguish a stable interaction from a
transient one using these maps.
1.2 Practical applications of the study of protein-protein interactions
The study of PPIs offers a new perspective on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the
study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which
diverged more than 1.5 billion years ago, show similarities and differences in the
structure of their nuclear pore. This essential protein complex forms a channel in the
membrane of the cell nucleus and controls the transport of molecules between the nucleus
and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear
pore and the part that subsequently diverged. The structural differences explain the
distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs
makes it possible to elucidate the robustness of a protein complex to mutations, that is,
the ability of the complex to function despite the perturbation. Diss and colleagues
systematically deleted the genes encoding the proteins found in the nuclear pore and the
retromer (31). The retromer is a nonessential protein complex whose function is to recycle
membrane receptors. By analyzing the interactions present in these complexes after each
perturbation, the authors observed that the nuclear pore remained functional despite the
loss of certain proteins, whereas the retromer dissociated completely after the loss of a
single protein. They were thus able to identify the proteins essential for the assembly of
these complexes and to demonstrate the importance of paralogs for robustness (31).
In the medical field, the study of PPIs has been widely used to discover new drugs (32-34).
Moreover, identifying structural differences in a protein complex between two organisms can
provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which should
eventually make it possible to treat the infections caused by these parasites (35). PPIs
also help to understand the genetic basis of diseases, as demonstrated by Sahni and
colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian
diseases. In nearly 60% of cases, the perturbation of interaction networks, whether partial
or complete, was responsible for the diseases under study. Furthermore, different mutations
in the same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed
to study them. These methods are complementary, since each has advantages and limitations
that allow it to target only a particular subset of the interaction network (37).
Nonetheless, the methods as a whole can be divided into two main categories: methods that
determine the composition of protein complexes, and methods that determine the physical
interactions between two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second
category comprises a wide variety of methods, including the yeast two-hybrid assay (Y2H),
the membrane yeast two-hybrid assay (MYTH), and the protein-fragment complementation assay
(PCA). The methods in the second category work on a very similar principle, based on the
reconstitution of a functional reporter that emits a signal when the two proteins
physically interact. The second category also includes three hybrid methods: Förster
resonance energy transfer between fluorescent molecules (FRET), cross-linking followed by
MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid
method" refers to methods that detect associations between proteins that are close together
in space, without these necessarily being physical interactions. These methods therefore
have characteristics of both categories. For the purposes of this project, they are
considered part of the second category because they provide information on the spatial
relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships these components
maintain with one another.
1.3.1 Methods that identify the members of a protein complex: purification
of protein complexes followed by mass spectrometry
Purifying protein complexes and identifying their components by MS is an approach that aims
to isolate a protein complex and identify its members. Several techniques are used to
purify protein complexes, including affinity chromatography. Affinity chromatography
separates a protein of interest and its interactors from a protein extract by means of an
epitope specific to that protein. This epitope is recognized by an antibody bound to the
purification column. Several rounds of purification can be performed to reduce the
nonspecific interactions that cause background noise. The isolated proteins are then
digested into peptides. The mass spectrometer ionizes these peptides and separates them
according to their mass-to-charge ratio, which yields a mass spectrum. Comparing the
resulting profiles with those in a database identifies the proteins found in the complex
(38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS scan, a
peptide is selected and fragmented, and a new spectrum is acquired from the resulting
fragments. This additional spectrum provides more information about the peptide (41, 42).
Other purification techniques exist, such as size-exclusion chromatography, in which
separation is based on the size of the protein complexes. The main advantage of this type
of purification is that it can isolate all the protein complexes of an organism for study
(43).
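The separation by mass-to-charge ratio described above can be illustrated with a short calculation. The sketch below is only an illustration and is not part of the original work: it assumes positive-mode ionization in which each charge comes from one added proton, and the function name `mass_to_charge` and the 1000 Da peptide mass are hypothetical.

```python
# Illustrative sketch only: m/z of a protonated peptide ion.
# Assumption: each unit of charge comes from one added proton.

PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of a peptide of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1000.0  # hypothetical peptide mass, in Da
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

Under this assumption, the same peptide appears at a different position in the spectrum for each charge state it carries.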
1.3.2 Methods that determine the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and protein-fragment
complementation
Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest through a linker. When the two proteins of
interest physically interact, the two reporter fragments assemble and reconstitute a
functional reporter that produces a detectable signal. In Y2H, the reporter is a
transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a
specific selection medium. Initially, the transcription factor was Gal4p and the selection
medium contained galactose (44). Y2H was a pioneering method that led to the development of
several other methods. However, the technique has some limitations. First, in classical
Y2H, the proteins studied must be soluble. Variants of the method have nevertheless been
developed to allow the study of membrane proteins (45-47); one such variant is described in
the next paragraph. Second, because the reporter is a transcription factor, the
interactions tested must take place in the nucleus, which may alter the endogenous
localization of the proteins. The technique also has low sensitivity, shows background
noise, and is not quantitative. It often requires overexpression of the proteins, which can
generate false positives. It is therefore impossible to establish links between the
abundance of a protein and the strength or abundance of an interaction between proteins
(48-50). Despite these constraints, it is still widely used because it allows the PPIs of
another species, such as humans, to be studied in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. In the presence of a physical interaction between the proteins of
interest, the transcription factor bound to the reconstituted ubiquitin is released,
thereby activating transcription of a reporter gene. Split-ubiquitin-based methods have
enabled major advances in the study of insoluble membrane proteins located outside the
nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise
and the impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The
choice of reporter and of the split site were decisive in the design of the method.
Moreover, because the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to associate spontaneously unless they are
very close to each other, which reduces background noise (54). In yeast, PCA uses as its
reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers
resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and
is involved, among other things, in the synthesis of certain DNA bases (purines and
thymine). In yeast, the observed signal is cell density, that is, the number of cells that
managed to grow on the selection medium. This technique has the advantage of being
quantitative and of preserving the natural promoter of the proteins studied (48, 55, 56).
Furthermore, PCA results suggest that the cellular localization of the proteins is
preserved: there is a gene ontology enrichment for several proteins known to share the same
cellular localization (55). However, a change in localization cannot be ruled out, since
the reporter fragments are added at the C-terminus, which could interfere with the
localization signal sequence of the proteins (57).
One of the major drawbacks of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function, or abundance of the protein.
However, adding a linker often reduces these risks by moving the reporter fragment away
from the protein to which it is attached, which reduces interference between the two. It
may be necessary to optimize the linker's composition or length. There are three categories
of linkers: flexible linkers, rigid linkers, and in vivo cleavable linkers. Flexible
linkers are generally used when some mobility between the protein of interest and the
reporter fragment is desirable. Rigid linkers provide better separation between the protein
of interest and the reporter fragment and ensure that the functions of each element are
maintained. They are especially useful when a flexible linker is insufficient to properly
separate the two elements or interferes with the activity of the protein. In vivo cleavable
linkers allow the reporter fragment to be released under certain conditions. They are
particularly useful for allowing each element to carry out its own biological activity. It
is therefore essential to choose the linker and its parameters carefully to obtain the
expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity to each
other. The two fluorescent proteins are fused to the two proteins whose proximity is to be
tested. When the two proteins are close together, excitation of the donor fluorescent
protein leads to excitation of the acceptor fluorescent protein. The interaction is
detected by microscopy or cytometry through the emission of the acceptor fluorescent
protein. This method is particularly useful for following an interaction over time.
However, substantial background noise and the partial overlap of the fluorescence of the
two proteins can hinder interpretation of the results (60-63).
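The steep distance dependence that makes FRET a proximity reporter can be sketched with the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 the distance at which half of the energy is transferred. This is an illustrative sketch, not a calculation from this work: the default Förster radius of 50 Å is an assumed round value, and the function name is hypothetical.

```python
# Illustrative sketch only: FRET efficiency as a function of distance.
# Assumption: R0 = 50 angstroms is a placeholder, not a measured value.

def fret_efficiency(distance_angstrom: float,
                    forster_radius_angstrom: float = 50.0) -> float:
    """Return the transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and the Forster radius > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    for r in (25.0, 50.0, 75.0, 100.0):
        print(f"r = {r:5.1f} A -> E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power term, efficiency falls from nearly complete to nearly zero over a narrow range of distances around R0, which is why a FRET signal is read as evidence of proximity.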
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that, before purification, the proteins are attached to one another by
covalent bonds. These bonds withstand enzymatic digestion and thus provide structural
information on how the proteins associate within the protein complex. Nevertheless,
cross-linking complicates data analysis and can potentially lead to a mistaken picture of
the architecture of the protein complex. This method is difficult to apply to the global
study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused
to the protein of interest. Interactors carrying a biotin group on their accessible lysines
are selectively isolated and identified by MS. BioID can detect weak and transient
interactions in addition to interactions between neighboring proteins. However, the biotin
ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in
molecular biology. This large size can impair the activity of the protein of interest or
the formation of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly interesting because they give a more
global view of the PPI network. They report on the proximity of proteins, giving access to
a new scale of molecular resolution that is otherwise difficult to reach. In addition to
being complex, existing techniques require specialized infrastructure (equipment and
databases) and are difficult to apply at large scale. Developing simpler, higher-throughput
hybrid methods would make it possible to better define the architecture of protein
complexes and their subcomplexes at low molecular resolution. Such methods would complement
the two categories of methods. These new hybrid methods would compensate for the
shortcomings of high-molecular-resolution methods such as crystallography or nuclear
magnetic resonance, which determine the precise structure of proteins or protein complexes.
Indeed, those high-resolution methods are difficult to apply to many protein complexes and
require an approach specific to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection
of protein-protein interactions
Because of its relative simplicity and the linker that connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble, flexible peptide segment composed of two repeats of the
following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good
association of the reporter fragments in the cellular environment. Indeed, glycine and
serine are two small amino acids, nonpolar and polar, respectively. The linker connects the
reporter fragment to the C-terminus of the proteins under study.
The length of the linker also places a constraint on the ability to detect an interaction,
as was notably observed by the research team that developed PCA at large scale (55). While
studying RNA polymerase (RNApol) II and several other protein complexes, the authors
noticed that an interaction was 35 times more likely to be detected when the C-termini of
the proteins of interest were less than 82 Å apart (55). This distance corresponds to the
length of the two linkers placed end to end. In addition, an earlier study had shown that
increasing the length of the linker made it possible to determine the conformation of a
dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to
obtain new structural information.
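The relationship between linker length and detection range stated above can be made concrete with a rough calculation. The sketch below is only an illustration under simplifying assumptions: the per-residue length is back-calculated from the figures in the text (82 Å for two linkers of two GGGGS repeats each, placed end to end), the linkers are treated as fully extended, and any repeat count other than two is hypothetical.

```python
# Illustrative sketch only: maximal span of two (GGGGS)n linkers end to end.
# Assumption: the per-residue length is derived from the 82 angstroms quoted
# in the text for two 10-residue linkers; it is not a measured value.

MOTIF = "GGGGS"
REFERENCE_SPAN_ANGSTROM = 82.0
REFERENCE_RESIDUES = 2 * 2 * len(MOTIF)  # two linkers, two repeats each
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def linker_sequence(n_repeats: int) -> str:
    """Return the amino-acid sequence of a linker made of n GGGGS repeats."""
    if n_repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * n_repeats


def max_span_angstrom(repeats_a: int, repeats_b: int) -> float:
    """Return the end-to-end span of two fully extended linkers, in angstroms."""
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for n in (2, 4, 6):  # repeat counts above 2 are hypothetical
        print(f"{n} repeats per linker: about {max_span_angstrom(n, n):.0f} A")
```

A flexible linker rarely adopts a fully extended conformation, so these values are best read as upper bounds on the reach rather than typical distances.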
1.6 Research objectives
The preceding results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow
the detection of interactions that are increasingly distant in space, thereby modulating
the scale of molecular resolution. This adaptation would yield a new hybrid method that
could help define protein-protein associations between protein complexes and subcomplexes.
The first objective was to assess the general impact of different linker lengths on the
ability to detect protein-protein associations. To this end, protein-protein associations
between 15 proteins found in seven protein complexes were tested against the proteins found
in these complexes and their known interactors. The second objective was to assess the
impact of increasing linker length on our understanding of the architecture of protein
complexes and their subcomplexes. Five protein complexes differing in size and flexibility
were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi
(COG) complex. The study was carried out with different combinations of linker lengths. The
final objective was to determine whether increasing linker length made it possible to
detect associations between proteins that are farther apart in space. To do so, distances
were calculated between the proteins contained in the proteasome structures and compared
with the experimental results.
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous
powerful genetic tools, its rapid cell division, and the abundance of data on the
structure of protein complexes and on PPIs. Moreover, this organism has played a key role
in advancing knowledge in various fields such as the determination of protein function,
regulatory networks, gene expression, protein interaction networks and the study of human
diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangements. We examined
the potential of the protein-fragment complementation assay based on dihydrofolate
reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein
complexes. We showed that using longer peptide linkers between the fusion proteins and
the DHFR fragments improves the detection of protein-protein interactions and reveals
interactions that are more distant in space. Longer linkers thus provide an improved tool
for detecting and measuring protein-protein interactions and protein proximity in vivo.
We used this tool to further investigate the architecture of the RNA polymerases, the
proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open
new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we
examine the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are further apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to
mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at
the transcriptional, translational and posttranslational levels contributes to the
diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex memberships or detect physical
association (81). The first category includes methods based on protein purification
followed by mass spectrometry. In this case, protein assignment to a specific complex
depends on stable associations among proteins that survive cell lysis and fractionation
or affinity purification (82, 83). The majority of PPIs that populate interactome
databases derive from such methods because a single purification leads to the inference
of many interactions among the co-purified proteins. Unfortunately, very little is known
about the structural and context dependencies of PPIs inferred from co-complex membership
because detecting an association does not provide information on the spatial organization
of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins
and reveals direct or nearly direct interactions. Such methods include the commonly used
yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories of methods are
potentially complementary because, on the one hand, they tell us which proteins assemble
into complexes in the cell and, on the other hand, how proteins may be physically located
relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future development in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing
methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter
protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a
functional protein that acts as a reporter for the association of the two proteins (55,
89-94). Proteins are usually connected to the reporter fragments with a linker of ten
amino acids. In principle, the length of the linker limits the maximum distance between
the proteins for an interaction to be detectable. In the first large-scale study
performed using DHFR PCA in yeast, it was shown that the distance constraint determined
by linker length could affect the ability to detect PPIs (55). For the RNA polymerase
(RNApol) II complex and several other protein complexes for which the distance between
C-termini of proteins could be measured, protein interactions were 3.5 times more likely
to be detected if the C-termini were within 82 Å of each other. In addition, an earlier
study in mammalian cells showed that increasing the linker length of the PCA reporter
makes it possible to detect configuration changes in a dimeric membrane receptor (69).
Together, these results suggest that linkers of variable sizes could improve the
detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between
proteins in living cells. Here, we test the effect of linker size on the ability to
detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid
medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B
(HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were
grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium
sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine,
and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[12]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to linkers of varying sizes. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two
for the 4xL) were introduced between the existing linker and the DHFR fragments,
resulting in plasmids pAG25-3x-linker-F[12]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[12]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
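As a minimal illustration of the linker nomenclature, the peptide sequences of the 2xL, 3xL and 4xL can be written out as repeats of the Gly-Gly-Gly-Gly-Ser motif. The function and variable names below are hypothetical and only the peptide level is shown; the codons actually used are not given in the text.

```python
# Illustrative sketch: only the motif (GGGGS) and the repeat counts (2, 3, 4)
# come from the text; all names here are hypothetical.
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser in one-letter code


def linker_peptide(repeats: int) -> str:
    """Return the peptide sequence of a linker made of `repeats` motifs."""
    if repeats < 1:
        raise ValueError("a linker needs at least one motif")
    return MOTIF * repeats


linkers = {f"{n}xL": linker_peptide(n) for n in (2, 3, 4)}

for name, seq in linkers.items():
    # Print the linker name, its sequence and its length in amino acids.
    print(name, seq, len(seq), "aa")
```

The 2xL produced this way is ten residues long, which matches the ten-amino-acid linker described for the standard assay.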
In order to replace the 2xL from pAG25-linker-DHFR F[12]-ADHterm with the 3xL and 4xL,
3xL-DHFR F[12] and 4xL-DHFR F[12] DNA fragments were synthesized and inserted in the
plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[12]
fragments were then amplified by PCR, digested with DpnI and purified. The plasmid
pAG25-linker-DHFR F[12]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[12] region was gel-extracted. The
fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio
of 5:1. Cloning reactions were transformed in E. coli and clones were selected on
2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI
and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[12]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from
pAG25-3xL-DHFR F[12]-ADHterm and pAG25-4xL-DHFR F[12]-ADHterm, respectively. The DHFR
F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3]
region was gel-extracted. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[12]-ADHterm plasmids, with an insert (linker):insert (DHFR
F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[12] and DHFR F[3] fusions,
respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[12]/F[3] fragments, along with the NAT (for DHFR F[12]) or HPH (for DHFR
F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the
gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[12]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused
with the 2xL and 4xL. These proteins were selected because we could easily assess their
abundance using antibodies raised against them. 20 OD600 of exponentially growing cells
were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment
(Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and
supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of
cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p)
SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry
device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or
mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2
hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes
were then probed with IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey
anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer +
0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and
the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®).
Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[12] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or because
they interact with one of them based on data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus
was also included in the set of preys as controls. Each prey was present in four
replicates, two on each prey plate, so each interaction was measured four times. Preys
were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and
considered the consensus protein complexes published by (84) to choose 95 central and
associated protein members of the following complexes: RNApol I, II and III, the
proteasome and the COG complex. These complexes were selected because they vary in size
(RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested);
proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among
protein members of these complexes have been shown to be detectable, at least partially,
by DHFR PCA. In addition, there are published structures available for the RNApol and
proteasome complexes, making it possible to compare our results with known protein
complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa
and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the
COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[12] and/or
2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95
proteins initially selected are tagged with 2xL and 4xL in at least one mating type).
MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells
were generated, including all strains mentioned above. Baits and preys were positioned so
that, in a block of four strains, all combinations of linker sizes could be tested for a
specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey
interactions was present in 14 replicates for the RNApol and COG complexes and in 16
replicates for the proteasome complex. The blocks were randomly positioned on the colony
arrays. Each 1536-array was finally designed to contain a double border of a strain
showing a weak interaction (Pop2-2xL-F[12]-Arc35-2xL-F[3]) to avoid any border effects on
the growth of the colonies.
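The block-of-four design can be sketched as the Cartesian product of the two linker sizes on the bait and on the prey. This is only an illustration of the enumeration; the protein names `X` and `Y` and the function name are hypothetical, and the physical arrangement of the four colonies within a block is not specified here.

```python
from itertools import product

# The two linker sizes used in the intra-complexes experiment (from the text).
LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str):
    """List the four bait/prey linker combinations tested for one protein pair."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, repeat=2)]


# Hypothetical protein names, for illustration only.
block = linker_block("X", "Y")
for bait_fusion, prey_fusion in block:
    print(bait_fusion, "x", prey_fusion)
```

The enumeration order follows the order given in the text: 2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL.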
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated
at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin)
replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics)
and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by
arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies
were further condensed into 384-format arrays and finally into 1536-format arrays using a
96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with
the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at
30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an
incubation time of two days at 30 °C per round. Finally, diploid strains were replicated
on MTX medium and incubated at 30 °C for four days, after which a second round of MTX
selection was performed. Plates were incubated at 30 °C for another four days. Images
were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid
selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting
from a new cross between bait and prey strains. The correlation between the results of
the two experiments can be seen in Fig S1E. For the intra-complexes experiment, we
confirmed results for 10 pairs of interacting proteins by measuring cell growth in a
spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL
DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water.
Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX
and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently
imaged with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using
the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the
pixel intensity matrix was smoothed using a two-way median polish and averaged with the
raw image. We then converted the images to binary files, and a manual threshold was
applied across plates. We selected colonies for measurement with a circular selection
using particle detection with the built-in function "Analyze Particles" in ImageJ64. We
excluded particles touching the edge of the selection and those that had an area below 20
pixels and a circularity below 0.5, using the particle that is closest to the center. We
considered the particle to be a colony if its center of mass was within the mid-distance
between two colonies. All plate images were also examined. The average of the background
pixels was subtracted from the colony intensity.
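The particle filter can be illustrated with a short sketch. It uses the standard ImageJ definition of circularity, 4π × area / perimeter², and assumes that a particle is rejected when either criterion fails; the function names and the example measurements are hypothetical and are not taken from the custom ImageJ script.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


def is_colony_candidate(area: float, perimeter: float,
                        min_area: float = 20.0,
                        min_circularity: float = 0.5) -> bool:
    """Keep particles that are large enough and round enough.

    Assumption: failing either threshold excludes the particle.
    """
    return area >= min_area and circularity(area, perimeter) >= min_circularity


# Hypothetical measurements in pixels: a disc of radius 10, a speck, a streak.
disc = (math.pi * 10 ** 2, 2 * math.pi * 10)
speck = (10.0, 12.0)
streak = (100.0, 100.0)

for label, (a, p) in {"disc": disc, "speck": speck, "streak": streak}.items():
    print(label, is_colony_candidate(a, p))
```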
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a
size smaller than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R
package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction
score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were
considered significantly detected interactions. These Zs were used to compare the same
interaction with different linker size combinations. We considered changes significant
when Zs differed by more than 2.
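The scoring scheme above can be sketched in a few lines. The analysis itself used the R package mixtools; the plain expectation-maximization routine below is a simplified Python stand-in written for illustration, with a hypothetical initialization and synthetic scores, and it is not expected to reproduce the mixtools fit exactly.

```python
import math
import random


def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, n_iter=200):
    """Plain EM for a two-component normal mixture.

    Returns [(mean, sd, weight), (mean, sd, weight)] sorted by mean, so the
    first component is taken as the background distribution.
    """
    lo, hi = min(values), max(values)
    spread = (hi - lo) or 1.0
    mus = [lo + 0.25 * spread, lo + 0.75 * spread]  # hypothetical starting values
    sds = [spread / 4.0, spread / 4.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each value.
        resp = []
        for x in values:
            p0 = weights[0] * normal_pdf(x, mus[0], sds[0])
            p1 = weights[1] * normal_pdf(x, mus[1], sds[1])
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: update mean, sd and weight of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            n_k = sum(r)
            if n_k == 0:
                continue
            mus[k] = sum(ri * x for ri, x in zip(r, values)) / n_k
            var = sum(ri * (x - mus[k]) ** 2 for ri, x in zip(r, values)) / n_k
            sds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = n_k / len(values)
    return sorted(zip(mus, sds, weights))


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb from the background component."""
    return (interaction_score - b) / sdb


# Synthetic interaction scores, for illustration only.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(200)] + \
         [rng.gauss(8.0, 1.0) for _ in range(50)]
(b, sdb, _), _ = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, b, sdb) > 2.5]
print(len(detected), "of", len(scores), "synthetic scores pass Zs > 2.5")
```

Taking the lower-mean component as background mirrors the description in the text, where b and sdb come from the background distribution and the threshold is Zs greater than 2.5.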
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined
as values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3
represent the first and third quartiles). Colonies corresponding to the control
interaction and positioned on the array edges were removed from downstream analyses, as
were strains for which sequencing results revealed mutations in the DHFR fusion proteins.
After these final filtering steps, interactions with at least four replicates for every
linker combination were retained and the median colony size was used as the Is.
Significant interactions were identified as described above (Fig S1B). For the RNApol and
the proteasome, the estimated mean (b) and standard deviation (sdb) of the background
distribution were calculated for each linker combination and each complex separately. For
the COG complex, because the number of pairwise interactions is limited to 64, all the
results were combined to calculate these parameters. An interaction was considered
detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected
interactions with at least one linker combination, some pairs were filtered out, mainly
because they did not pass all of the thresholds or because the fusion strains (Taf14 and
Spt5 fused to DHFR F[3]) presented inconsistent results for all tested interactions,
leaving us with a total of 228 (197 unique) pairs of interacting proteins.
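The extreme-outlier rule at the start of this paragraph can be sketched as follows. The quartile estimator is an assumption (Python's "inclusive" method); the text does not say which estimator was used, and the replicate values are made up for illustration.

```python
from statistics import quantiles


def drop_extreme_outliers(values, k=3.0):
    """Remove values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    # Assumption: quartiles computed with the 'inclusive' interpolation method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes for one interaction.
replicates = [10, 11, 12, 13, 14, 100]
kept = drop_extreme_outliers(replicates)
print(kept)
```

With k = 3 the fences are wider than the common 1.5 × IQR rule, so only extreme values are discarded.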
At this step, pairs of interacting proteins presenting a new interaction (i.e. the
interaction was not detected with the reference linker size (2xL-2xL) but was detected
with a longer linker combination) were separated from the others and classified as new
interactions (Table S1C). For the remaining pairs, because baits and preys were
positioned so that, in a block of four adjacent strains, all combinations of linker
lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL), Is for the different linker size combinations could be compared directly. The
difference from the reference 2xL-2xL interaction was calculated for each linker
combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify
significant differences in colony size (with FDR-corrected p-values). These pairs of
interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change
(t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction
was detected with the reference linker size (2xL-2xL) and presented significant changes
for at least one longer linker combination (difference greater than 1 or smaller than -1
with t-test FDR p-value < 0.05) (Table S1C).
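The three categories can be summarized as a small decision function. This is a sketch under stated assumptions: the function name and argument layout are hypothetical, the FDR-corrected p-values are taken as precomputed inputs, and a pair with a significant p-value but an absolute difference of 1 or less is treated as unchanged, a case the text does not spell out.

```python
def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues,
                  min_diff=1.0, alpha=0.05):
    """Classify a protein pair from its detection status and Is differences.

    detected_ref    -- True if detected with the reference 2xL-2xL combination
    detected_longer -- True if detected with at least one longer combination
    diffs           -- {combination: Is difference relative to 2xL-2xL}
    fdr_pvalues     -- {combination: FDR-corrected paired t-test p-value}
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(
        abs(diffs[c]) > min_diff and fdr_pvalues[c] < alpha for c in diffs
    )
    return "quantitative change" if changed else "unchanged"


# Hypothetical values, for illustration only.
print(classify_pair(False, True, {}, {}))
print(classify_pair(True, True, {"4xL-4xL": 1.8}, {"4xL-4xL": 0.01}))
print(classify_pair(True, True, {"4xL-4xL": 0.3}, {"4xL-4xL": 0.40}))
```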
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the
RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the
base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome
complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of
the experimental protein sequences of the individual sections of the proteasome complex
with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4.
Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is
composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an
incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4.
Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in
the C-terminal section. In these cases, the last available residue was used instead to
calculate the distance (a list is provided in Table S2D). The distances were calculated
from the weighted shortest path using Dijkstra's algorithm as implemented in NetworkX (an
example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα
atoms of surface residues were used as nodes to build the graph. Edges were placed
between pairs of nodes within a distance cutoff of 15 Å for the RNApol II and of 30 Å for
the proteasome. The weight of each edge was equal to the distance between the node pair.
Surface residues were identified as follows. First, the structure of the protein complex
was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a
solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These
dots were exported in the "wrl" graphic file format. From this file, the coordinates of
each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and
within 20 Å of any dot of the proteasome structure were considered surface residues (see
Fig S2D for a representation of the method for the proteasome). In cases where multiple
copies of the proteins were present within the complexes, the mean of the minimal
possible distances was used for the analyses.
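The surface-path distance can be illustrated with a toy graph. The analysis used NetworkX; the sketch below swaps in a standard-library implementation of Dijkstra's algorithm so that it runs without dependencies, and the node names, coordinates and cutoff are hypothetical.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = distance."""
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in graph[node]:
            nd = d + w
            if nd < best.get(nxt, math.inf):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return math.inf


# Toy "surface" of four C-alpha positions on a square (coordinates in Å).
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (10, 10, 0), "D": (0, 10, 0)}
graph = build_graph(coords, cutoff=12.0)
path_len = shortest_path_length(graph, "A", "C")
straight = math.dist(coords["A"], coords["C"])
print(path_len, straight)
```

In this toy case the diagonal exceeds the cutoff, so the path along the surface is longer than the straight-line distance, which is the reason for measuring distances along a surface graph rather than through the complex.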
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[12] and DHFR F[3]) to be
introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs.
No evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55),
we constructed reporter strains for 15 proteins that are part of seven complexes, with
the 2xL, 3xL and 4xL fused to the DHFR F[12] fragment each time. Using high-density yeast
colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR
F[3] (with the regular 2xL). These include proteins known to interact with the baits,
proteins within the same complexes as the baits, and random proteins used as controls,
for a total of 26,640 potential interactions in four replicates (Table S1B). We detected
99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively
(Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with
longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than
two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL
(seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased
interactions may represent steric effects that reduce signal due to the fusion of the
DHFR fragments. Four out of nine increased interactions were reported by affinity-capture
mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer
linkers may allow for the detection of PPIs that are not necessarily direct. Moreover,
the four interactions with the highest PCA signal represent cases between baits and preys
within the same complexes, suggesting that there is no decrease in specificity with the
elongated linkers. Finally, for the cases where proteins were not in the same complex or
were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic
interactions and physical interactions (in vitro and in vivo) have been described between
the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in
living cells (such as between Arc18 and Pup1), often with an increased signal with the
4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA
with increased linker size reveals new interactions and could be an improved tool to
study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for
PPIs between the RNApol I, II and III and the COG complex were also performed. Among the
10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table
S1C), representing PPIs among 228 protein pairs (197 unique, reciprocal interactions such
as X-DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] counting as only one PPI) after
filtration.
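The distinction between all detected pairs and unique pairs amounts to ignoring the bait/prey orientation. A minimal sketch, with hypothetical protein names:

```python
def unique_pairs(directed_pairs):
    """Collapse reciprocal bait/prey pairs (X-Y and Y-X) into one unordered pair."""
    return {frozenset(pair) for pair in directed_pairs}


# Hypothetical detected pairs, written as (bait, prey).
detected_pairs = [("X", "Y"), ("Y", "X"), ("X", "Z")]
unique = unique_pairs(detected_pairs)
print(len(detected_pairs), "pairs,", len(unique), "unique")
```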
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[12]-Y-DHFR F[3] versus Y-DHFR F[12]-X-DHFR F[3],
were correlated, as previously noted (55) (Fig S1C - 4xL-4xL PPIs). Also, for almost 60%
of interacting pairs (135/228 or 114/197 unique), no significant change in interaction
strength was observed when using the 4xL compared to the 2xL, reinforcing the observation
that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig
1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs
and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the
linker length can substantially widen the repertoire of detected interactions for a
complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
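The fraction of detected PPIs per combination of subcomplexes (as plotted in Fig 1E) can be computed along the following lines. This is a minimal sketch: the function name and the detection outcomes are hypothetical, and only a few proteasome proteins are included to keep the example short.

```python
from collections import defaultdict


def detection_fraction(tested, subcomplex_of):
    """Fraction of detected PPIs for each unordered pair of subcomplexes.

    tested: iterable of (protein_a, protein_b, detected) tuples.
    subcomplex_of: mapping from protein name to subcomplex label.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [detected, tested]
    for protein_a, protein_b, detected in tested:
        # Sort the labels so that base-lid and lid-base share one key.
        key = "-".join(sorted((subcomplex_of[protein_a], subcomplex_of[protein_b])))
        counts[key][1] += 1
        if detected:
            counts[key][0] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


subcomplex_of = {"Rpt1": "base", "Rpt2": "base", "Rpn5": "lid", "Pre1": "core"}
# Hypothetical outcomes for one linker combination.
tested = [
    ("Rpt1", "Rpt2", True),
    ("Rpt1", "Rpn5", False),
    ("Rpn5", "Rpt2", True),
    ("Pre1", "Rpn5", False),
]
fractions = detection_fraction(tested, subcomplex_of)
print(fractions)
```

Running the same function once per linker combination would give the values needed to compare within-subcomplex and between-subcomplex detection rates.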
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL
combination than the other combinations (Fig 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are better detected with longer linker sizes (Fig S1D).
Finally, we find that distance among proteins is significantly longer for cases where longer
linker size increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
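The negative relationship between pairwise distance and z-score described above is a rank correlation. As a minimal sketch, a Spearman coefficient can be computed in plain Python by ranking both variables (averaging ranks for ties) and taking the Pearson correlation of the ranks. The distances and z-scores below are invented for illustration, and the p-values reported in the study are not reproduced here.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminus distances (in angstroms) and PPI z-scores.
distances = [20, 35, 50, 80, 110]
z_scores = [9.1, 7.4, 8.0, 3.2, 1.5]
rho = spearman(distances, z_scores)
print(round(rho, 3))
```

A negative coefficient, as in this toy example, means that z-scores tend to decrease as the distance between C-termini increases.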
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini, for interactions that are not
affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected
results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by
DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
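The link between colony size, z-score and a detection threshold in panel (B) can be sketched as follows. The exact normalization used in the study is not given in this legend, so the formula below (deviation from the mean of a background distribution, in units of its standard deviation) is an assumption, and all numbers are invented.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background distribution."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation of the background
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(colony_sizes, background, threshold=2.5):
    """Flag colonies whose z-score exceeds the detection threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes (arbitrary units).
background = [10, 12, 11, 9, 13]
colonies = [11, 14, 20]
flags = positive_ppis(colonies, background)
print(flags)
```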
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
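A distance-weighted shortest path of the kind illustrated in panel (C) can be sketched with Dijkstra's algorithm over a set of surface points. This is not the implementation used for the figure: the coordinates, the point names and the `cutoff` parameter (the maximum distance at which two points are considered connected) are all hypothetical.

```python
import heapq
import math


def shortest_surface_path(points, start, goal, cutoff):
    """Length of the shortest path from start to goal.

    points: mapping from point name to (x, y, z) coordinates.
    Two points are joined by an edge when they are at most `cutoff` apart,
    and the edge weight is their Euclidean distance.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other, coords in points.items():
            if other == node:
                continue
            step = math.dist(points[node], coords)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf  # goal not reachable with this cutoff


# Hypothetical coordinates: two C-termini and two intermediate surface points.
points = {
    "Cter_A": (0.0, 0.0, 0.0),
    "surf_1": (3.0, 4.0, 0.0),
    "surf_2": (6.0, 8.0, 0.0),
    "Cter_B": (6.0, 8.0, 5.0),
}
path_length = shortest_surface_path(points, "Cter_A", "Cter_B", cutoff=6.0)
print(path_length)
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex instead of cutting through it, which is why the result can be longer than the straight-line distance between the two C-termini.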
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
to each other in space without these associations necessarily being physical interactions.
Such a method would therefore make it possible to examine in more depth and better dissect
the architecture of protein complexes. In practice, the approach consisted of modifying the
length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first
necessary to verify whether increasing the linker length changed the interactions detected.
It was also relevant to test the application of the method to the study of protein complexes
using several combinations of linkers of different lengths. Finally, the validity of the method
could be further confirmed by comparing the results obtained with distances measured from
the available protein structures of the proteasome.
The results of the first validation show that by changing a single parameter, namely by
doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all the unique interactions detected, more than
30% were new. One could therefore expect to obtain almost as many new interactions if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths that
are used. For example, it could be lower if only one combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers informs on the organization of
protein complexes, particularly when it involves central proteins. Finally, the detected
associations reflect well the organization of protein complexes into subcomplexes. By
comparing the distances between proteins in the proteasome structures with the PCA results
obtained, it is possible to confirm that increasing the linker length does allow the detection
of associations between proteins that are more distant in space.
The modification made to DHFR PCA represents a real advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein associations.
In future experiments, it would be appropriate to use the standard linker in addition to
linkers of other lengths, which would provide a validation and a point of comparison and
would help detect problems that may have occurred during the construction of the proteins.
For example, it becomes easier to spot a problem of incorrect recombination or the appearance
of mutations: interactions would be observed for the correctly constructed protein, whereas
the problematic one would show none. However, adding this control certainly makes the
experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA
gives access to an additional hybrid method that remains relatively simple. It does not
require any particular infrastructure, yet it can also be applied at large scale using a
robotic platform. Moreover, DHFR PCA is an in vivo method that keeps the endogenous
promoter for protein expression. The fragments do not tend to interact spontaneously with
each other unless they are very close, which reduces false positives. DHFR PCA can be
performed either on solid medium or in liquid medium. It is therefore easy to study PPIs
under several growth conditions or in the presence of cellular perturbations. It can also be
monitored in real time, which gives access to the study of interaction dynamics (56). These
features provide certain advantages compared to other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. It would
first be necessary to determine the maximal length that still allows the detection of plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize the new information gained, taking into account
the additional complexity that comes with each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different
linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed, the
longer the linker, the more distant the detected associations, which decreases the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into consideration. For small protein
complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas the
resolution should be coarser for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. The size of these
complexes greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are notably
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution afforded by lengthening the linker would give us a general picture of
their architecture. At the scale of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
is composed of a barrel-shaped catalytic subcomplex flanked by one or two regulatory
subcomplexes. It comprises 33 proteins, some of which are present in more than one copy
(10-13). Given its importance in protein recycling, the proteasome is an attractive target for
fighting cancer and neurodegenerative diseases, for example (14-16).
The two preceding examples clearly show the essential role of protein-protein associations.
Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction
network. Mapping PPI networks is essential for understanding the organization, functioning,
and cellular viability of a given organism. The PPI network has been mapped at large scale for
several organisms, notably human (17), Saccharomyces cerevisiae (18-20), Drosophila
melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26), and several viruses
(27-29). These maps are a static picture of the network and do not fully account for the
cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To overcome
this limitation, additional maps were subsequently produced that consider the dynamics of
interaction networks, namely by perturbing cell growth conditions. Among other things, they
provide information on the adaptation, or plasticity, of an organism in the presence of a
stress or a new environment. Despite this new perspective, it remains difficult to distinguish
a stable interaction from a transient one using these maps.
1.2 Concrete applications of the study of protein-protein interactions
The study of PPIs offers a new perspective on fields such as evolution and medicine. The
evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the
study of the nuclear pore in yeast and trypanosome (30). These two organisms, which diverged
more than 1.5 billion years ago, show similarities and differences in the structure of their
nuclear pore. This essential protein complex forms a channel in the nuclear membrane and
controls the transport of molecules between the nucleus and the cytoplasm. Obado and
colleagues thus identified the ancestral part of the nuclear pore and the part that
subsequently diverged. The structural differences explain the distinct mRNA export mechanisms
in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the
robustness of a protein complex to mutations, that is, the capacity of the complex to function
despite the perturbation. Diss and colleagues systematically deleted the genes encoding the
proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential
protein complex whose function is to recycle membrane receptors. By analyzing the interactions
present in these complexes after each perturbation, the authors observed that the nuclear pore
remained functional despite the loss of certain proteins, whereas the retromer dissociated
completely after the loss of a single protein. They were thus able to identify the proteins
essential for the assembly of these complexes and to demonstrate the importance of paralogs
for robustness (31).
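The perturbation analysis described above can be illustrated with a minimal Python sketch. It is not the analysis performed by Diss and colleagues (31): the protein names, the interaction sets, and the `retained_fraction` helper are all hypothetical and only show how the fraction of a complex's interactions that survives each deletion could be tabulated.

```python
def retained_fraction(reference, after_deletion, deleted):
    """Fraction of reference interactions still detected after a deletion.

    Interactions that involve the deleted protein itself are excluded,
    since they cannot be detected once that protein is absent.
    """
    testable = {pair for pair in reference if deleted not in pair}
    if not testable:
        return 0.0
    return len(testable & after_deletion) / len(testable)


# Hypothetical four-protein complex (toy data, not taken from the thesis).
reference = {
    frozenset(("A", "B")),
    frozenset(("B", "C")),
    frozenset(("C", "D")),
    frozenset(("A", "D")),
}

# Hypothetical interactions detected in each deletion strain.
detected = {
    "A": {frozenset(("B", "C")), frozenset(("C", "D"))},  # remaining pairs persist
    "B": set(),                                            # no pair is detected
}

for deleted, pairs in detected.items():
    print(deleted, retained_fraction(reference, pairs, deleted))
```

In this toy example, a robust complex would keep a fraction close to 1 after most deletions, while a fragile one would drop to 0 after a single deletion.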
In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In
addition, identifying structural differences in a protein complex between two organisms can
provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which will eventually
make it possible to treat the infections caused by these parasites (35). PPIs also help to
understand the genetic basis of diseases, as shown by Sahni and colleagues. This team examined
nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases,
perturbation of interaction networks, either partial or complete, was responsible for the
diseases under study. Moreover, different mutations in the same gene lead to different
perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary, since each has advantages and limitations that
restrict it to a different subset of the interaction network (37). Nevertheless, all of these
methods can be divided into two main categories: methods that determine the composition of
protein complexes and methods that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, either by affinity
chromatography or by separation chromatography, and then analyze it by mass spectrometry (MS).
The second category covers a wide variety of methods, including yeast two-hybrid (Y2H),
membrane yeast two-hybrid (MYTH), and the protein-fragment complementation assay (PCA). The
methods in the second category rely on a very similar principle: the reconstitution of a
functional reporter that emits a signal when the two proteins interact physically. The second
category also includes three hybrid methods: energy transfer between fluorescent molecules
(FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this
context, the term "hybrid method" refers to methods that detect associations between proteins
that are close together in space without these necessarily being physical interactions. These
methods therefore share the characteristics of both categories. For the purposes of this
project, they are considered part of the second category because they provide information on
the spatial relationships between proteins.
The two categories of methods are complementary because they define, on the one hand, the
components of a protein complex and, on the other hand, the relationships among them.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS aims
to isolate a protein complex and identify its members. Several techniques are used to purify
protein complexes, including affinity chromatography. Affinity chromatography separates a
protein of interest and its interactors from a protein extract using an epitope specific to
that protein. This epitope is recognized by an antibody bound to the purification column.
Several rounds of purification can be performed to reduce the nonspecific interactions that
cause background noise. The isolated proteins are then digested into peptides. The mass
spectrometer ionizes these peptides and separates them according to their mass-to-charge
ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database
identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can
also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum
is acquired from the resulting fragments. This additional spectrum provides more information
about that peptide (41, 42). Other purification techniques exist, such as size-exclusion
chromatography, in which separation depends on the size of the protein complexes. The main
value of this type of purification is that it can isolate all of the protein complexes of an
organism for study (43).
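The mass-to-charge separation mentioned above can be made concrete with a short Python sketch. It assumes positive-mode ionization by protonation, which the text does not specify; the `mass_to_charge` function and the 1500 Da example mass are illustrative and not taken from the thesis.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass_da, charge):
    """m/z of a peptide carrying `charge` extra protons: (M + z * m_p) / z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Arbitrary illustrative peptide mass: the same peptide appears at a
# different m/z value for each charge state.
for z in (1, 2, 3):
    print(z, round(mass_to_charge(1500.0, z), 4))
```

Under this assumption, the same peptide produces one peak per charge state, which is why spectra are interpreted in terms of m/z rather than mass alone.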
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and protein-fragment complementation
Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest through a linker. When the two proteins of interest
interact physically, the two reporter fragments assemble, reconstituting a functional reporter
that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was
Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that
paved the way for several others. However, the technique has some limitations. First, in
classical Y2H, the proteins studied must be soluble. Variations of the method have
nevertheless been developed to allow the study of membrane proteins (45-47); this variant is
the subject of the next paragraph. Second, because the reporter is a transcription factor, the
interactions tested must take place in the nucleus, which may alter the endogenous
localization of the proteins. The technique also has low sensitivity, shows background noise,
and is not quantitative. It often requires overexpression of the proteins, which can generate
false positives. It is therefore impossible to relate the abundance of a protein to the
strength or abundance of an interaction between proteins (48-50). Despite these constraints,
it is still widely used because it allows the PPIs of another species, such as human, to be
studied in a simpler model (51).
In MYTH, the two reporter fragments are a mutated ubiquitin to which a transcription factor is
attached. When the proteins of interest interact physically, the transcription factor attached
to the reconstituted ubiquitin is released, activating transcription of a reporter gene.
Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins,
which are insoluble and located outside the nucleus. However, MYTH shares some drawbacks with
Y2H, such as substantial background noise and the impossibility of quantifying results (47-50,
52, 53).
PCA is similar to the two methods described above, but instead of using a transcription factor
as the reporter, it uses a protein that has been split into two fragments. The choice of
reporter and of the split site were decisive in the design of the method. Moreover, because
the reporter fragments come from a single protein rather than from two subunits of the same
protein, they do not tend to associate spontaneously unless they are very close to each other,
which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of
the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the
cell. This enzyme is essential for cell growth and is involved in particular in the synthesis
of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density,
that is, the number of cells that managed to grow on the selection medium. This technique has
the advantage of being quantitative while preserving the native promoter of the proteins
studied (48, 55, 56). Furthermore, PCA results suggest that the cellular localization of the
proteins is preserved. Indeed, there is a gene ontology enrichment for several known proteins
sharing the same cellular localization (55). However, a change in localization cannot be ruled
out, since the reporter fragments are added at the C-terminus, which could interfere with the
proteins' localization signal sequence (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. However, adding
a linker often reduces these risks by moving the reporter fragment away from the protein to
which it is attached, which reduces interference between the two. The composition or length of
the linker may need to be optimized. There are three categories of linkers: flexible linkers,
rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some
mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers
provide better separation between the protein of interest and the reporter fragment and ensure
that the functions of each element are maintained. They are most useful when a flexible linker
is insufficient to separate the two elements properly or interferes with the activity of the
protein. In vivo cleavable linkers allow the reporter fragment to be released under certain
conditions. They are particularly useful for allowing each element to carry out its own
biological activity. It is therefore essential to choose the linker and its parameters
carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins located close to each other.
The two fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent
protein when the two proteins are close to each other. The interaction is detected by
microscopy or cytometry through the emission of the acceptor fluorescent protein. This method
is particularly useful for following an interaction over time. However, substantial background
noise and partial overlap of the fluorescence of the two proteins can hinder interpretation of
the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS techniques,
except that the proteins are attached to each other by covalent bonds before purification.
These bonds withstand enzymatic digestion and thereby provide structural information on how
proteins associate within the protein complex. Nevertheless, cross-linking complicates data
analysis and can lead to an incorrect picture of the architecture of the protein complex. This
method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to
the protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions in
addition to interactions between neighbouring proteins. However, the biotin ligase is larger
than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology.
This large size can impair the activity of the protein of interest or the formation of
interactions. In addition, this method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They provide information on protein proximity, giving access to a
molecular scale of resolution that is otherwise difficult to reach. In addition to being
complex, existing techniques require special infrastructure (equipment and databases) and are
difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods
would help better define the architecture of protein complexes and their subcomplexes at low
molecular resolution. Such methods would complement the two categories of methods. They would
also compensate for the shortcomings of high-molecular-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to
many protein complexes and require an approach specific to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and because a linker connects the reporter fragments to the
proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is
a short, soluble and flexible peptide segment composed of two repeats of the following motif:
four glycines and one serine (GGGGS). It ensures good flexibility and good association of the
reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino
acids, the former nonpolar and the latter polar and uncharged. The linker connects the
reporter fragment to the C-terminus of the proteins under study.
Linker length also places a constraint on the ability to detect an interaction, as observed
notably by the research team that developed large-scale PCA (55). While studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noticed that an
interaction was 3.5 times more likely to be detected when the C-termini of the proteins of
interest were less than 82 Å apart (55). This distance corresponds to the length of the two
linkers placed end to end. In addition, an earlier study had shown that increasing linker
length made it possible to determine the conformation of a dimeric receptor (69). It is thus
possible to detect new interactions and, in doing so, to obtain new structural information.
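
The relation between linker length and detectable distance can be illustrated with a rough back-of-the-envelope calculation. The sketch below is only an illustration: it assumes that reach scales linearly with the number of residues, and it derives the per-residue length from the single figure quoted above (82 Å for two ten-residue linkers placed end to end). The function name `max_reach` and the longer linker combinations are hypothetical.

```python
# Rough reach estimate for (GGGGS)n linkers. Every number is derived from
# the single figure given in the text: 82 angstroms for two linkers of two
# GGGGS repeats each, placed end to end.
RESIDUES_PER_REPEAT = 5                             # one GGGGS motif
REFERENCE_REACH_ANGSTROM = 82.0                     # two 2-repeat linkers
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT    # 20 residues in total

# Assumption: reach grows linearly with residue count (fully extended chain).
ANGSTROM_PER_RESIDUE = REFERENCE_REACH_ANGSTROM / REFERENCE_RESIDUES


def max_reach(bait_repeats: int, prey_repeats: int) -> float:
    """Upper-bound distance (angstroms) spanned by two linkers end to end."""
    residues = (bait_repeats + prey_repeats) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for bait, prey in [(2, 2), (2, 4), (4, 4)]:
        print(f"{bait} + {prey} repeats: {max_reach(bait, prey):.0f} angstroms")
```

Under this linear assumption, doubling both linkers doubles the upper bound. A flexible linker in a cell is unlikely to be fully extended, so these values are best read as ceilings rather than typical distances.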
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the
detection of interactions increasingly distant in space, which would modulate the scale of
molecular resolution. This adaptation would then yield a new hybrid method that could help
define protein-protein associations between protein complexes and subcomplexes. The first
objective was to test the general impact of different linker lengths on the ability to detect
protein-protein associations. To this end, protein-protein associations between 15 proteins
found in seven protein complexes were tested against the proteins found in these complexes and
their known interactors. The second objective was to test the impact of increasing linker
length on our understanding of the architecture of protein complexes and their subcomplexes.
Five protein complexes differing in size and flexibility were studied: the proteasome,
RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried
out with different combinations of linker lengths. The last objective was to test whether
increasing linker length allowed the detection of associations between proteins that are
farther apart in space. To do so, distances were calculated between the proteins contained in
the proteasome structures and compared with the experimental results.
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of protein
complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in
various fields such as the determination of protein function, regulatory networks, gene
expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR fragments
improves the detection of protein-protein interactions and reveals interactions that are more
distant in space. Extended linkers thus provide an improved tool for detecting and measuring
protein-protein interactions and protein proximity in vivo. We used this tool to further
investigate the architecture of the RNA polymerases, the proteasome and the conserved
oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are farther apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions and
protein proximity in living cells. We use this tool to further investigate the architecture of
the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in
yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the
distance constraint imposed by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between the C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes
could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly,
distances between proteins in living cells. Here we test the effect of linker size on the
ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar for
solid medium) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% noble agar, drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar for solid medium) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
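
As an illustration of what synonymous codons mean here, the sketch below translates two different DNA sequences and checks that both encode the Gly-Gly-Gly-Gly-Ser motif. The two sequences are hypothetical examples, not the sequences used in the plasmids, and the codon table is deliberately restricted to the glycine and serine codons of the standard genetic code.

```python
# Minimal codon table: only the glycine and serine codons of the standard
# genetic code, which is all that a GGGGS repeat needs.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string made only of Gly/Ser codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences for illustration only (NOT the plasmid sequences).
repeat_a = "GGTGGCGGTGGCTCT"
repeat_b = "GGAGGGGGAGGTAGC"

if __name__ == "__main__":
    # Different nucleotide sequences, identical peptide.
    print(repeat_a != repeat_b, translate(repeat_a), translate(repeat_b))
```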
To replace the 2xL in pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in plasmid pUC57 with flanking BamHI and XbaI restriction sites. The
3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified.
Plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was
gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR-amplified
from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively.
The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for pAG25-3xL/4xL-DHFR F[1,2]-ADHterm,
with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused to the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to the
2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2,000), Goat anti-Bcy1p (1:1,000) or
Mouse anti-Actin (as a loading control; 1:5,000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10,000), IRDye® 680RD Donkey anti-Goat IgG (1:5,000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or interact
with one of them, based on data reported in BioGRID in October 2014 (96). A random set of
97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included
in the set of preys as controls. Each prey was present in four replicates, two on each prey
plate, so each interaction was measured four times. Preys were randomly positioned to avoid
location biases.
For the intra-complexes experiment, we reviewed the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9; 7 tested), proteasome (n=47; 44
tested) and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
published structures are available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing
89.5% of the proteins (85 of the 95 proteins initially selected were tagged with 2xL and 4xL
in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, within a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
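
The block design described above amounts to taking every pairing of the two linker sizes for one bait and one prey. A minimal sketch follows; the protein names `BaitA` and `PreyB` are placeholders, the function name `linker_block` is hypothetical, and the labels simply follow the convention that baits carry F[3] and preys carry F[1,2] in this experiment.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str) -> list[tuple[str, str]]:
    """Return the four bait/prey fusion pairs that make up one block.

    The order matches the text: 2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL.
    """
    return [
        (f"{bait}-{bait_linker}-F[3]", f"{prey}-{prey_linker}-F[1,2]")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


if __name__ == "__main__":
    for bait_fusion, prey_fusion in linker_block("BaitA", "PreyB"):
        print(bait_fusion, "x", prey_fusion)
```

Keeping the four combinations physically adjacent is what later allows scores for different linker sizes to be compared within a block.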
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed into 384-format arrays and finally into 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey
plates were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two
rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was positive, null or negative. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
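
For reference, the cell densities spotted in such a five-fold dilution series can be computed as follows. The number of dilution steps is not stated in the text, so the value used in the example (five) is an assumption, as is the function name `serial_dilution`.

```python
def serial_dilution(start_od: float, fold: float, steps: int) -> list[float]:
    """OD600 of each spot, starting with the undiluted culture."""
    return [start_od / fold ** i for i in range(steps)]


if __name__ == "__main__":
    # Starting OD600 of 1 and five-fold steps as in the text; 5 steps assumed.
    for i, od in enumerate(serial_dilution(1.0, 5.0, 5)):
        print(f"spot {i}: OD600 = {od:g}")
```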
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and applied a manual threshold
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those with an area below 20 pixels and
circularity below 0.5, using the particle closest to the center. We considered a
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
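
The two-way median polish mentioned above is a standard robust decomposition of a matrix into an overall level, row effects, column effects and residuals. The pure-Python sketch below is not the ImageJ script used in the study. In particular, it assumes that the "smoothed" image is the additive fit (overall + row + column) and that averaging with the raw image is a plain pixel-wise mean; both are guesses at details the text does not give, and the function names are hypothetical.

```python
from statistics import median


def median_polish(matrix, n_iter=10):
    """Two-way median polish.

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    residuals = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(residuals), len(residuals[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the median of each row into the row effects.
        for i in range(n_rows):
            m = median(residuals[i])
            row_eff[i] += m
            residuals[i] = [v - m for v in residuals[i]]
        # Re-centre the column effects on the overall level.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the median of each column into the column effects.
        for j in range(n_cols):
            m = median(residuals[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                residuals[i][j] -= m
        # Re-centre the row effects on the overall level.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, residuals


def polish_and_average(matrix, n_iter=10):
    """Average the additive fit with the raw values (assumed interpretation)."""
    overall, row_eff, col_eff, _ = median_polish(matrix, n_iter)
    return [
        [(overall + row_eff[i] + col_eff[j] + matrix[i][j]) / 2
         for j in range(len(matrix[0]))]
        for i in range(len(matrix))
    ]
```

Because the fit is built from medians, a single very bright or very dark pixel has little influence on the estimated row and column trends.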
Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed
after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were kept and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected. These Zs were used to compare the same interaction across
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
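
The scoring step can be sketched as follows, taking the background mean and standard deviation as given (the mixture fit itself is not reproduced here). The thresholds are read as 2.5 for detection and 2 for a change between linker combinations; the numerical values in the example and the function names are hypothetical.

```python
DETECTION_THRESHOLD = 2.5   # Zs above which an interaction counts as detected
CHANGE_THRESHOLD = 2.0      # Zs difference treated as a significant change


def z_score(interaction_score: float, background_mean: float,
            background_sd: float) -> float:
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - background_mean) / background_sd


def is_detected(zs: float) -> bool:
    return zs > DETECTION_THRESHOLD


def has_changed(zs_reference: float, zs_other: float) -> bool:
    return abs(zs_other - zs_reference) > CHANGE_THRESHOLD


if __name__ == "__main__":
    b, sdb = 8.0, 1.0                  # hypothetical background parameters
    zs_2x = z_score(9.0, b, sdb)       # hypothetical Is with the 2xL-2xL pair
    zs_4x = z_score(12.0, b, sdb)      # hypothetical Is with the 4xL-2xL pair
    print(is_detected(zs_2x), is_detected(zs_4x), has_changed(zs_2x, zs_4x))
```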
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave
incoherent results for all tested interactions, leaving a total of 228 (197 unique) pairs
of interacting proteins.
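The extreme-outlier rule at the start of this paragraph can be sketched as below. This is an illustrative reading of the rule rather than the original script: the quartile convention ("inclusive" interpolation) is an assumption, since the text does not state which one was used, and the replicate values are invented.

```python
import statistics

def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    # Assumption: inclusive quartiles; other conventions shift the fences slightly.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical replicate colony sizes with one aberrant measurement.
replicates = [1, 2, 3, 4, 5, 100]
print(drop_extreme_outliers(replicates))
```

A multiplier of 3 rather than the more common 1.5 removes only the most aberrant colonies, so genuine variation between replicates is preserved.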
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Is for the different linker size combinations
could therefore be compared directly. The difference from the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, where the interaction was detected with the reference linker size (2xL-2xL)
and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, where the interaction was
detected with the reference linker size (2xL-2xL) and showed a significant change for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
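The classification into new, unchanged and quantitatively changed interactions can be sketched as follows. Two points are assumptions rather than statements from the text: the FDR correction is implemented here as the Benjamini-Hochberg procedure, and pairs that are significant but differ by 1 or less are treated as unchanged. The paired t-test itself is not reproduced; its p-values are taken as given.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def classify(detected_reference, detected_longer, difference, fdr_p):
    """Assign one pair to a category using the thresholds given in the text."""
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    if fdr_p < 0.05 and abs(difference) > 1:
        return "quantitative change"
    # Assumption: significant but small differences are kept as unchanged.
    return "unchanged"

# Hypothetical raw p-values from paired t-tests on three linker combinations.
raw_p = [0.01, 0.04, 0.03]
print(benjamini_hochberg(raw_p))
print(classify(True, True, 1.5, 0.01))
```

Requiring both a corrected p-value below 0.05 and a difference larger than 1 guards against calling very small but reproducible shifts a quantitative change.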
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core, while the inner rings of the core were taken from 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between C-
terminal residues. In several cases, the structure of the protein is incomplete in the C-
terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated as the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as
nodes to build the graph. Edges were placed between each pair of nodes
within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the two nodes. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dot_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
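The surface-path distance described above can be illustrated with a small self-contained sketch. The analysis in this work used NetworkX; the version below uses only the Python standard library so that it runs without dependencies, and the four node names and coordinates are invented for illustration. Nodes stand for surface Cα atoms, edges join nodes closer than the cutoff, and edge weights are Euclidean distances.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Join every pair of nodes within `cutoff` (Å); edge weight = distance."""
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf

# Hypothetical surface Cα coordinates (Å); A_Cterm and B_Cterm are the endpoints.
coords = {
    "A_Cterm": (0.0, 0.0, 0.0),
    "s1": (10.0, 0.0, 0.0),
    "s2": (20.0, 0.0, 0.0),
    "B_Cterm": (20.0, 10.0, 0.0),
}
graph = build_graph(coords, cutoff=15.0)
path_len = shortest_path_length(graph, "A_Cterm", "B_Cterm")
print(round(path_len, 2))
```

Because the two endpoints are farther apart than the cutoff, no direct edge exists and the path must travel through intermediate surface nodes, which is the reason for using a path along the surface rather than a straight-line distance through the complex.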
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
the effect is minor and not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, each fused to
the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These preys include proteins known to interact with the baits, proteins
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-
2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting
as a single PPI).
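The distinction between total and unique pairs can be made concrete with a short sketch. The pairs listed are purely illustrative and are not results from this study; the point is only that a bait-prey pair and its reciprocal collapse to one unordered pair.

```python
def unique_pairs(directed_pairs):
    """Collapse (bait, prey) and (prey, bait) into one unordered pair."""
    return {frozenset(pair) for pair in directed_pairs}

# Hypothetical detected interactions, written as (bait F[1,2], prey F[3]).
detected = [
    ("Rpb2", "Rpb3"),
    ("Rpb3", "Rpb2"),   # reciprocal of the first entry
    ("Rpb5", "Rpo26"),
]
unique = unique_pairs(detected)
print(len(detected), len(unique))
```

Using an unordered set for each pair also handles a protein tested against itself, which simply yields a one-element set.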
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, supporting the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had a clear impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to
detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-
4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This loss may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). Together, these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-
4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
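The negative relationship between distance and z-score is reported as a Spearman correlation (Fig. 2B). The sketch below shows how such a rank correlation is computed, using only the standard library; the distance and z-score values are invented for illustration and do not come from Table S2A.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based mean rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical C-terminus distances (Å) and interaction z-scores.
distances = [20, 50, 80, 120, 150]
z_scores = [9.0, 7.5, 4.0, 4.5, 1.0]
rho = spearman(distances, z_scores)
print(round(rho, 2))
```

A rank-based coefficient is a natural choice here because the relationship between distance and signal need not be linear, only monotonic.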
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA remains one of the simplest assays because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-
complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-
value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-
16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structures and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 924
4C2M chain 3 Rpa49 RNApol I 944
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 897
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 972
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 596
4C2M chain 10 Rpa43 RNApol I 814
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 882
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 924
4C2M chain 20 Rpa135 RNApol I 962
4C2M chain 21 Rpa190 RNApol I 885
4C2M chain 22 Rpa14 RNApol I 551
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 882
4C2M chain 27 Rpa43 RNApol I 802
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 892
4C3I chain C Rpc40 RNApol I 993
4C3I chain B Rpa135 RNApol I 982
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 551
4C3I chain G Rpa43 RNApol I 783
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 847
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 972
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 979
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 936
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 808
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 603
5FJA chain A Rpo31 RNApol III 962
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 739
5FJA chain G Rpc25 RNApol III 858
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 827
5FJA chain H Rpb8 RNApol III 945
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 849
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 843
5FJA chain N Rpc53 RNApol III 738
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 572
Table S2C Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 971
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 971
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 829
5A5B-centered chain E Pre2 Proteasome 995
5A5B-centered chain EA Rpn11 Proteasome 895
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 859
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 931
5A5B-centered chain W Rpn2 Proteasome 909
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 931
Constructed proteasome chain 22 Rpn2 Proteasome 909
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 829
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 895
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 859
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 931
Constructed proteasome chain 58 Rpn2 Proteasome 909
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 829
Constructed proteasome chain 66 Rpn11 Proteasome 895
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 859
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to
detect interactions that are farther apart in space. (E) Repetition of the standard DHFR PCA
for selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of change are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
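The thresholding described in panel (B) can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each colony size is standardized against the mean and standard deviation of all colony sizes in the same experiment; the caption does not state which background distribution was actually used. The function name `call_positive_ppis`, the input layout and the default threshold are illustrative choices, not the author's pipeline.

```python
from statistics import mean, stdev


def call_positive_ppis(colony_sizes, z_threshold=2.5):
    """Return {pair: z_score} for pairs whose z-score exceeds the threshold.

    colony_sizes maps a (bait, prey) pair to its measured colony size.
    Assumption: z-scores are computed against the mean and sample standard
    deviation of every colony size passed in (at least two values needed).
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All colonies have the same size, so nothing stands out.
        return {}
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_threshold}
```

In practice the background would probably be estimated from the bulk of non-interacting pairs rather than from all measurements, so this sketch only shows the logic of the cut-off.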
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
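The distance-weighted shortest path in panels (C) and (D) can be sketched in Python as Dijkstra's algorithm on a neighbor graph of surface points. This is an illustration under stated assumptions, not the procedure used in the thesis: the caption does not say how the graph was built, so the `cutoff` parameter (maximum length of a single step between surface points), the function name `surface_path_length` and the input layout are all hypothetical.

```python
import heapq
import math


def surface_path_length(points, start, end, cutoff=10.0):
    """Length of the shortest path from `start` to `end` over surface points.

    points maps a point identifier to its (x, y, z) coordinates.
    Two points are linked when they lie within `cutoff` of each other, and
    each link is weighted by its Euclidean length (Dijkstra's algorithm).
    Returns math.inf when `end` cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    done = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node in done:
            continue  # stale queue entry
        if node == end:
            return dist
        done.add(node)
        for other, coords in points.items():
            if other in done:
                continue
            step = math.dist(points[node], coords)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf
```

Because the path is forced to follow surface points, the returned length is at least the straight-line distance between the two termini. That is the reason to use a path of this kind when a linker has to travel around the complex rather than through it.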
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term "hybrid
method" refers to a method that detects associations between proteins that are close to each
other in space without necessarily being in physical interaction. Such a method would make it
possible to examine and dissect the architecture of protein complexes in greater depth.
Concretely, the approach consisted of modifying the length of the linkers used in DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the
linker length changed the set of interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results obtained with distances measured from the available protein
structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variant of DHFR PCA detects new interactions while preserving the
specificity of the method. Indeed, among all unique interactions detected, more than 30% were
new. One could therefore expect to obtain roughly the same proportion of new interactions if
this PCA variant were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths that
are used. For example, it could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers provides
information on the organization of protein complexes, particularly when central proteins are
involved. Finally, the associations detected accurately reflect the organization of protein
complexes into subcomplexes. Comparing the distances between proteins in the proteasome
structures with the PCA results confirms that increasing the linker length does allow the
detection of associations between proteins that are farther apart in space.
The modification made to DHFR PCA represents a significant advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the longer
linkers. This would provide a validation and a point of comparison, and would help detect
problems that may have occurred during construction of the fusion proteins. For example, it
becomes easier to spot a faulty recombination event or the appearance of mutations:
interactions would be observed for the correctly constructed protein, whereas the defective
one would show none. Adding this control does, however, make the experiments and analyses
more complex. Despite this drawback, this variant of DHFR PCA provides an additional hybrid
method that remains relatively simple. It requires no special infrastructure, yet it can also
be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method
that retains the endogenous promoter for protein expression. The fragments do not tend to
associate spontaneously unless they are very close together, which reduces false positives.
DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study
PPIs under several growth conditions or in the presence of cellular perturbations. It can
also be monitored in real time, which gives access to the study of interaction dynamics (56).
These features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. It would
first be necessary to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal length increment that maximizes new information, taking into account
the additional complexity that each added linker brings. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2]-tagged proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the associations detected, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as size and flexibility, should be taken into account. For small protein complexes, a
finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser
resolution would be needed for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. The size of these
complexes greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved, respectively, in
the degradation and the storage of messenger RNA during cellular stress, and they are linked
to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The
scale of resolution afforded by linker extension would give a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
of mRNA in both organisms (30). In addition, perturbing PPIs makes it possible to elucidate
the robustness of a protein complex to mutations, that is, the capacity of the complex to
function despite the perturbation. Diss and colleagues systematically deleted the genes
encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a
non-essential protein complex whose function is to recycle membrane receptors. By analyzing
the interactions present in these complexes after each perturbation, the authors observed
that the nuclear pore remained functional despite the loss of certain proteins, whereas the
retromer dissociated completely after the loss of a single protein. They were thus able to
identify the proteins essential for the assembly of these complexes and to demonstrate the
importance of paralogs for robustness (31).
In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In
addition, identifying structural differences in a protein complex between two organisms can
provide attractive targets for selectively inhibiting the complex of one organism. Very
recently, a research group developed an inhibitor that targets the proteasome of Leishmania
donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually
make it possible to treat the infections caused by these parasites (35). PPIs also help in
understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team
examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60%
of cases, perturbation of interaction networks was responsible for the diseases studied,
affecting the networks either partially or completely. Moreover, different mutations in the
same gene lead to different perturbations (36).
1.3 Categories of methods for studying protein-protein interactions
Given the importance of PPI networks in cell biology, several methods have been developed to
study them. These methods are complementary, since each has advantages and limitations that
allow it to target only a particular subset of the interaction network (37). Nevertheless,
all of the methods can be divided into two main categories: methods that determine the
composition of protein complexes, and methods that determine the physical interactions
between two proteins.
The first category includes methods that purify a protein complex, by either affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second category
covers a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid
(MYTH) and the protein-fragment complementation assay (PCA). The methods in the second
category rely on a very similar principle: the reconstitution of a functional reporter that
emits a signal when the two proteins physically interact. The second category also includes
three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking
followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid
method" refers to methods that detect associations between proteins that are close together
in space without necessarily being in physical interaction. These methods therefore share
characteristics of both categories. For the purposes of this project, they are considered
part of the second category because they provide information on the spatial relationships
between proteins.
The two categories of methods are complementary because one defines the components of a
protein complex while the other defines the relationships among those components.
1.3.1 Methods identifying the members of a protein complex: purification of protein
complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS
aims to isolate a protein complex and identify its members. Several techniques are used to
purify protein complexes, including affinity chromatography. Affinity chromatography
separates a protein of interest and its interactors from a protein extract by means of an
epitope specific to that protein. This epitope is recognized by an antibody bound to the
purification column. Several rounds of purification can be performed to reduce the
nonspecific interactions that cause background noise. The isolated proteins are then
digested into peptides. The mass spectrometer ionizes these peptides and separates them
according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting
profiles with those in a database identifies the proteins present in the complex (38-40).
Tandem mass spectrometry (MS/MS) is also possible: from a first MS scan, a peptide is
selected and fragmented, and a second spectrum is acquired from the resulting fragments.
This additional spectrum provides more information about the peptide (41, 42). Other
purification techniques exist, such as size-exclusion chromatography, in which separation
is based on the size of the protein complexes. The main value of this approach is that it
allows all the protein complexes of an organism to be isolated for study (43).
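As a minimal illustration of the mass-to-charge ratio on which this separation relies, the
sketch below computes m/z for a protonated peptide ion. It assumes positive-mode ionization
in which each charge is carried by one added proton; the peptide mass and the function name
are hypothetical and are not taken from the cited studies.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons

def mz(neutral_mass_da: float, charge: int) -> float:
    """Return m/z of a peptide carrying `charge` added protons ([M + zH]z+)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge

# Hypothetical peptide of 1000.0 Da observed at three charge states
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same peptide therefore appears at a different position in the spectrum for each charge
state, which is one reason spectra are matched against a database rather than read directly.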
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and the protein-fragment
complementation assay
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments
attached to the two proteins of interest through a linker. When the two proteins of
interest physically interact, the two reporter fragments assemble, reconstituting a
functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor which, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor
was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method
that enabled the development of several others. However, the technique has some
limitations. First, in classic Y2H the proteins studied must be soluble. Variations of the
method have nevertheless been introduced to allow the study of membrane proteins (45-47);
this variant is the subject of the next paragraph. Second, because the reporter is a
transcription factor, the interactions tested must take place in the nucleus, which may
alter the endogenous localization of the proteins. The technique also has low sensitivity,
produces background noise and is not quantitative. It often requires overexpression of the
proteins, which can generate false positives. It is therefore impossible to relate the
abundance of a protein to the strength or abundance of an interaction between proteins
(48-50). Despite these constraints, it is still widely used because it allows the PPIs of
another species, such as human, to be studied in a simpler model (51).
In MYTH, the two reporter fragments are the halves of a mutated ubiquitin to which a
transcription factor is attached. When the proteins of interest physically interact, the
transcription factor attached to the reconstituted ubiquitin is released, activating
transcription of a reporter gene. Split-ubiquitin methods have led to major advances in the
study of membrane proteins, insoluble proteins and proteins located outside the nucleus.
However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the
impossibility of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription
factor as the reporter, it uses a protein that has been split into two fragments. The
choice of reporter and of the cleavage site were decisive elements in the design of the
method. Moreover, because the reporter fragments come from a single protein rather than
from two subunits of the same protein, they do not tend to interact spontaneously unless
they are very close to each other, which reduces background noise (54). In yeast, PCA uses
as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers
resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and
is notably involved in the synthesis of certain DNA bases (the purines and thymine). In
yeast, the observed signal is cell density, that is, the number of cells that managed to
grow on the selection medium. The technique has the advantage of being quantitative and of
preserving the native promoter of the proteins studied (48, 55, 56). Furthermore, results
obtained by PCA suggest that the cellular localization of the proteins is preserved: there
is a gene ontology enrichment for many proteins known to share the same cellular
localization (55). However, a change in localization cannot be ruled out, since the
reporter fragments are added at the C-terminus, which could interfere with the localization
signal sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function or abundance of the protein. Adding a
linker often reduces these risks by moving the reporter fragment away from the protein to
which it is attached, which reduces interference between the two. The composition or length
of the linker may need to be optimized. There are three categories of linkers: flexible
linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used
when some mobility between the protein of interest and the reporter fragment is desirable.
Rigid linkers provide better separation between the protein of interest and the reporter
fragment and ensure that the function of each element is maintained. They are most useful
when a flexible linker is insufficient to separate the two elements properly or when it
interferes with the activity of the protein. In vivo cleavable linkers allow the reporter
fragment to be released under certain conditions. They are particularly useful for allowing
each element to carry out its own biological activity. It is therefore essential to choose
the linker and its parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two
fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent
protein when the two proteins are close to each other. The interaction is detected by
microscopy or cytometry through the emission of the acceptor fluorescent protein. This
method is particularly useful for following an interaction over time. However, substantial
background noise and the partial overlap of the fluorescence of the two proteins can hinder
interpretation of the results (60-63).
Cross-linking followed by MS is nearly identical to the purification and MS techniques,
except that before purification the proteins are attached to one another by covalent bonds.
These bonds withstand enzymatic digestion and thus provide structural information on how
the proteins associate within the protein complex. However, cross-linking complicates data
analysis and can lead to an incorrect picture of the architecture of the protein complex.
The method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused
to the protein of interest. Interactors carrying a biotin group on their accessible lysines
are selectively isolated and identified by MS. BioID detects weak and transient
interactions in addition to interactions between neighboring proteins. However, the biotin
ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in
molecular biology. This large size can impair the activity of the protein of interest or
the formation of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more
global view of the PPI network. They report on protein proximity, giving access to a scale
of molecular resolution that is otherwise difficult to reach. Beyond their complexity, the
existing techniques require specialized infrastructure (equipment and databases) and are
difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods
would help define the architecture of protein complexes and their subcomplexes at low
molecular resolution. Such methods would complement both categories of methods. These new
hybrid methods would compensate for the shortcomings of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. Indeed, those high-resolution methods are difficult to apply
to many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and the linker that connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment composed of two repeats of the
following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good
association of the reporter fragments in the cellular environment, since glycine and serine
are two small amino acids, nonpolar and polar respectively. The linker connects the
reporter fragment to the C-terminus of the proteins under study.
Linker length also constrains the ability to detect an interaction, as notably observed by
the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II
and several other protein complexes, the authors noted that an interaction was 35 times
more likely to be detected when the C-termini of the proteins of interest were less than
82 Å apart (55). This distance corresponds to the length of the two linkers placed end to
end. In addition, an earlier study had shown that increasing linker length made it possible
to determine the conformation of a dimeric receptor (69). It is therefore possible to
detect new interactions and, in doing so, to obtain new structural information.
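To make the relationship between linker length and reach concrete, the sketch below scales
the 82 Å figure to other linker combinations. It rests on two assumptions that are not
stated in the text: that 82 Å for two 10-residue linkers implies about 4.1 Å per residue,
and that reach grows linearly with the number of GGGGS repeats. The function name is
illustrative, and the values are rough upper bounds rather than measured distances.

```python
RESIDUES_PER_REPEAT = 5          # one GGGGS motif
ANGSTROM_PER_RESIDUE = 82 / 20   # assumption: 82 A spread over two 10-residue linkers

def max_span_angstrom(repeats_bait: int, repeats_prey: int) -> float:
    """Estimate the end-to-end reach of two fully extended linkers, in angstroms."""
    if repeats_bait < 0 or repeats_prey < 0:
        raise ValueError("repeat counts must be non-negative")
    residues = (repeats_bait + repeats_prey) * RESIDUES_PER_REPEAT
    return round(residues * ANGSTROM_PER_RESIDUE, 1)

# Two-repeat linkers on both proteins reproduce the 82 A reference value;
# longer combinations are extrapolations under the linear assumption.
for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
    print(f"{bait}x / {prey}x linkers: {max_span_angstrom(bait, prey)} A")
```

A flexible linker in the cell is unlikely to be fully extended, so these numbers are better
read as a ceiling on detectable distance than as a prediction.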
1.6 Research objectives
The results above suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length of the DHFR PCA would make it
possible to detect interactions that are progressively more distant in space, thereby
modulating the scale of molecular resolution. This adaptation would yield a new hybrid
method that could help define protein-protein associations between protein complexes and
subcomplexes. The first objective was to assess the general impact of different linker
lengths on the ability to detect protein-protein associations. To this end, protein-protein
associations for 15 proteins found in seven protein complexes were tested against the
proteins found in these complexes and their known interactors. The second objective was to
assess the impact of increasing linker length on our understanding of the architecture of
protein complexes and their subcomplexes. Five protein complexes differing in size and
flexibility were studied: the proteasome, RNApol I, II and III, and the conserved
oligomeric Golgi (COG) complex. The study was carried out with different combinations of
linker lengths. The final objective was to test whether increasing linker length made it
possible to detect associations between proteins that are farther apart in space. To do so,
distances were calculated between the proteins in the proteasome structures and compared
with the experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of
protein complexes and on PPIs. This organism has also played a key role in advancing
knowledge in various fields, such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks and the study of human diseases
(70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes.
We showed that using extended peptide linkers between the fusion proteins and the DHFR
fragments improves the detection of protein-protein interactions and reveals interactions
that are more distant in space. Extended linkers thus provide an improved tool for
detecting and measuring protein-protein interactions and protein proximity in vivo. We used
this tool to further investigate the architecture of the RNA polymerases, the proteasome
and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for
dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are farther apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations
(31, 36, 74, 75), and have shown how the regulation of protein expression at the
transcriptional, translational and posttranslational levels contributes to the diversity of
protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex memberships or detect physical
association (81). The first category includes methods based on protein purification
followed by mass spectrometry. In this case, protein assignment to a specific complex
depends on stable associations among proteins that survive cell lysis and fractionation or
affinity purification (82, 83). The majority of PPIs that populate interactome databases
derive from such methods because a single purification leads to the inference of many
interactions among the co-purified proteins. Unfortunately, very little is known about the
structural and context dependencies of PPIs inferred from co-complex membership because
detecting an association does not provide information on the spatial organization of the
complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies
based on similar principles (52). The two categories are potentially complementary because,
on the one hand, they tell us which proteins assemble into complexes in the cell and, on
the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future developments in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94).
Proteins are usually connected to the reporter fragments with a linker of ten amino acids.
In principle, the length of the linker limits the maximum distance between the proteins for
an interaction to be detectable. The first large-scale study performed using DHFR PCA in
yeast showed that the distance constraint determined by linker length could affect the
ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other
protein complexes for which the distance between the C-termini of proteins could be
measured, protein interactions were 35 times more likely to be detected if the C-termini
were within 82 Å of each other. In addition, an earlier study in mammalian cells showed
that increasing the linker length of the PCA reporter allows the detection of configuration
changes in a dimeric membrane receptor (69). Together, these results suggest that linkers
of variable sizes could improve the detection of PPIs and even be used as a ruler to infer,
albeit roughly, distances between proteins in living cells. Here we test the effect of
linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid
medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B
(HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were
grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium
sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and
200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to a linker of varying size. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL,
3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the
plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2]
fragments were then amplified by PCR, digested with DpnI and purified. The plasmid
pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The
fragments and plasmids were assembled by Gibson cloning (95) with an insert/vector ratio of
5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp.
Finally, positive clones were verified by double digestion with XbaI and BamHI and by
Sanger sequencing.
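The insert/vector ratio used for assembly is typically a molar ratio, so the mass of insert
to add depends on the relative lengths of the two DNA molecules. The sketch below shows that
conversion using the standard length-based relation for double-stranded DNA. The fragment
lengths and vector amount are hypothetical, since they are not given in the text, and the
function name is illustrative.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule."""
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    # Equal masses of a shorter fragment contain proportionally more molecules,
    # so the required mass scales with insert_bp / vector_bp.
    return vector_ng * (insert_bp / vector_bp) * molar_ratio

# Hypothetical example: 100 ng of a 5000 bp digested vector, 700 bp insert, 5:1 ratio
print(round(insert_mass_ng(100, 5000, 700, 5), 1), "ng of insert")
```

For the three-fragment assembly described next, the same function can be applied once per
insert with its own length and ratio.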
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from
pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR
F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were
digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with
XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region
was gel-extracted. The remaining steps were performed as described above for the
pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker)/insert (DHFR
F[3])/vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions,
respectively (Table S1A). All fusions were performed at the 3′ end of genes. The
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR
F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene
to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following standard
procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or
YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR
fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused
with the 2xL and 4xL. These proteins were selected because we could easily assess their
abundance using antibodies raised against them. 20 OD600 of exponentially growing cells
were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma)
were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and
transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham).
After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed
with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided
by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading
control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After
three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary
antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG
(1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS +
0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on
membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were
done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected according to the criteria that they belonged to the same
complexes as the baits or that they interacted with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
proteins that are members of the following complexes: the RNApol I, II and III, the
proteasome and the COG complex. These complexes were selected because they vary in size
(RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome
(n=47, 44 tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned in a way that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
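The block design described above, in which four adjacent colonies cover every linker combination for one bait-prey pair, can be sketched as follows. This is only an illustration under stated assumptions: the function name `linker_block` and the string labels are hypothetical, and the actual plate coordinates used in the study are not reproduced.

```python
from itertools import product

# The two linker lengths used in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return the four (bait, prey) fusions that make up one block.

    The order follows the text: 2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL.
    """
    return [
        (f"{bait}-{bait_linker}", f"{prey}-{prey_linker}")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


def colonies_per_pair(n_block_replicates):
    """Number of colonies measured for one pair, given the block replicates."""
    return 4 * n_block_replicates
```

With 14 block replicates (RNApol and COG) a pair is represented by 56 colonies, and with 16 replicates (proteasome) by 64 colonies.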
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
the differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
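The particle selection rule can be illustrated with a short Python sketch. It is not the custom ImageJ script used in the study: the dictionary keys (`area`, `perimeter`, `centroid`, `touches_edge`) and the function names are hypothetical, circularity is taken as 4π·area/perimeter² (the definition reported by ImageJ), and a particle is rejected as soon as its area or its circularity falls below the threshold, which is one possible reading of the text.

```python
import math


def circularity(area, perimeter):
    """ImageJ-style circularity, 4*pi*area / perimeter**2, capped at 1.0."""
    if perimeter <= 0:
        return 0.0
    return min(1.0, 4.0 * math.pi * area / perimeter ** 2)


def pick_colony(particles, center, min_area=20, min_circularity=0.5):
    """Return the acceptable particle closest to the expected colony center.

    Each particle is a dict with keys 'area', 'perimeter', 'centroid'
    (an (x, y) tuple) and 'touches_edge' (bool). Returns None when no
    particle passes the filters.
    """
    kept = [
        p for p in particles
        if not p["touches_edge"]
        and p["area"] >= min_area
        and circularity(p["area"], p["perimeter"]) >= min_circularity
    ]
    if not kept:
        return None
    return min(kept, key=lambda p: math.dist(p["centroid"], center))
```

Keeping the thresholds as keyword arguments makes it easy to check how sensitive the colony calls are to the 20-pixel and 0.5 cutoffs.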
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
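The scoring steps can be summarized in a few lines of Python. This is a sketch under stated assumptions rather than the analysis code of the study: the mixture fit itself (mixtools normalmixEM in R) is not reproduced, so the background mean `b` and standard deviation `sdb` are taken as given inputs, and the function names are hypothetical.

```python
import math
import statistics


def interaction_score(raw_intensities):
    """Is: median of the log2(x + 1)-transformed replicate intensities."""
    return statistics.median(math.log2(v + 1) for v in raw_intensities)


def z_score(score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb from the background component."""
    return (score - b) / sdb


def compare_linkers(zs_reference, zs_longer, detect=2.5, min_delta=2.0):
    """Return (detected, changed) for a longer linker versus the reference.

    detected: Zs of the longer linker exceeds the detection threshold.
    changed:  the two Zs differ by more than min_delta.
    """
    detected = zs_longer > detect
    changed = abs(zs_longer - zs_reference) > min_delta
    return detected, changed
```

For example, replicate intensities of 3, 7, 15 and 31 give log2(x + 1) values of 2, 3, 4 and 5, hence Is = 3.5; with b = 1.0 and sdb = 0.5 the resulting Zs is 5.0, above the 2.5 detection threshold.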
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
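The extreme-outlier rule (values beyond three interquartile ranges from the quartiles) can be sketched as follows. The quartile convention used in the study is not stated; this example assumes the "inclusive" method of Python's `statistics.quantiles`, and the function name is hypothetical.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Drop values below Q1 - k*IQR or above Q3 + k*IQR.

    Quartiles are computed with the 'inclusive' method, which is an
    assumption; other conventions shift the fences slightly.
    """
    if len(values) < 2:
        return list(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

With replicate sizes 10, 11, 12, 13, 14, 15, 16 and 100, the quartiles are 11.75 and 15.25, the fences are 1.25 and 25.75, and only the value 100 is discarded before taking the median.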
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned in a way that in a block of four
adjacent strains all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
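The three categories can be expressed as a small decision function. This is a simplified sketch: the FDR procedure is not named in the text, so the Benjamini-Hochberg adjustment shown here is an assumption, the paired t-test p-values are taken as inputs, and the function names and category labels are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref:    detected with the 2xL-2xL reference (bool).
    detected_longer: detected with at least one longer combination (bool).
    diffs:           Is differences versus 2xL-2xL, one per longer combination.
    fdr_pvalues:     FDR-corrected paired t-test p-values, same order as diffs.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(
        abs(d) > min_diff and p < alpha for d, p in zip(diffs, fdr_pvalues)
    )
    return "quantitative change" if changed else "unchanged"
```

A pair detected with 2xL-2xL whose 4xL-4xL difference is 1.6 with an adjusted p-value of 0.01 is a quantitative change, whereas the same difference with an adjusted p-value of 0.2 leaves the pair unchanged.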
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface
residues were used as nodes to build the graph. The edges of the graph were placed between
each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the
proteasome. The weight of the edges was equal to the distance between node pairs. Surface
residues were identified as follows. First, the structure of the protein complex was represented
using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of
10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered as surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
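The surface-path distance can be illustrated with a self-contained sketch. The study used the Dijkstra implementation of NetworkX; the version below swaps in a standard-library implementation (heapq) so that it runs without dependencies. The node labels, coordinates and function names are hypothetical, and extracting surface Cα coordinates from the structures is not shown.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`.

    coords maps a node label to its (x, y, z) position; each edge is
    weighted by the Euclidean distance between the two nodes.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf
```

Because edges exist only between nearby surface residues, the resulting path follows the surface of the complex instead of cutting through its interior, which is why the distance is generally longer than the straight line between the two C-termini.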
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits and random proteins used as controls, for a total of
26640 potential interactions (45 baits x 592 preys) in four replicates (Table S1B). We detected
99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL respectively
(Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer
linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted
as a single PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228 or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
fact that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all lengths of
linkers (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are better detected with longer linker sizes (Fig S1D).
Finally, we find that the distance among proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that a longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
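The relationship between distance and z-score is reported as a Spearman rank correlation. A minimal standard-library sketch of that statistic is given here for readers who want to reproduce the coefficient from Table S2A-style data; the function names are hypothetical, ties receive average ranks, and the p-value computation is not included.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

A negative coefficient, as observed for the proteasome, means that z-scores tend to decrease as the distance between C-termini increases.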
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering every pair of reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs over the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast proteins  Complex  Reference  Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
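
The thresholding in (B) and the reciprocal-pair correlation in (C) can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this work: the function names are hypothetical, the z-score is assumed here to be computed against the mean and standard deviation of all colony sizes in one experiment, and the default cut-off of 2.5 is likewise an assumption of the sketch.

```python
from math import sqrt
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values as (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


def positive_ppis(colony_sizes, threshold=2.5):
    """Return {pair: z-score} for pairs whose z-score exceeds the threshold.

    colony_sizes maps a (bait, prey) pair to its colony size.
    The threshold value is an assumption of this sketch.
    """
    pairs = list(colony_sizes)
    scores = z_scores([colony_sizes[p] for p in pairs])
    return {p: z for p, z in zip(pairs, scores) if z > threshold}


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals,
    e.g. A-B signals against the reciprocal B-A signals."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Calling `positive_ppis` on a dictionary of colony sizes returns only the pairs above the cut-off, and `pearson_r` applied to matched reciprocal signals gives a coefficient comparable in kind to the r values quoted in (C).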
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
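
A distance-weighted shortest path of the kind shown in (C) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure used to produce the figure: surface residues are represented by hypothetical 3D coordinates, two residues are assumed to be connected when they lie within a chosen cut-off distance, each edge is weighted by the Euclidean distance, and the path length is found with Dijkstra's algorithm.

```python
import heapq
from math import dist, inf


def build_graph(coords, cutoff):
    """Link every pair of residues closer than `cutoff` (an assumed parameter).

    coords maps a residue label to an (x, y, z) tuple.
    Each edge is weighted by the Euclidean distance between the two residues.
    """
    names = list(coords)
    graph = {name: [] for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns inf if target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            return d
        if d > best.get(node, inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbour, inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return inf
```

With the two C-terminal residues and the surface residues as nodes, the returned length follows the surface of the complex rather than cutting through it, which is why it can exceed the straight-line distance between the two termini.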
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close in space, without
these associations necessarily being physical interactions. Such a method would make it
possible to examine and dissect the architecture of protein complexes in greater depth. In
practice, this meant modifying the length of the linkers used in the DHFR PCA in
S. cerevisiae. To validate the method, we first had to verify whether increasing linker
length changed the set of detected interactions. It was also relevant to test the method on
protein complexes using several combinations of linkers of different lengths. Finally, the
validity of the method could be further confirmed by comparing the results with distances
measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed
associations to be identified more reliably. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the detected associations suggests that the specificity of
the DHFR PCA is preserved despite the change in linker length. The in-depth study of the
five protein complexes shows that this variation of the DHFR PCA detects new interactions
while preserving the specificity of the method. Indeed, more than 30% of all unique
interactions detected were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have
already been studied. This percentage could vary with the number of combinations of linkers
of different lengths that are used. For example, it could be lower if only a single
combination were used, since some protein-protein associations were detectable only with a
specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment
appears to be sufficient to detect most of the new PPIs and most of those whose signal
increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases
can nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does allow the detection of associations
between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a real advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the linkers
of other lengths. This would provide a validation and a point of comparison, and would help
detect problems that may have arisen during construction of the fusion proteins. For
example, it becomes easier to spot a faulty recombination event or the appearance of
mutations, because interactions would be observed for the correctly constructed protein
whereas the problematic one would show none. Adding this control does, however, make the
experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA
provides an additional hybrid method that remains relatively simple. It requires no
particular infrastructure, yet it can also be applied at large scale with a robotic
platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for
protein expression. The fragments do not tend to associate spontaneously unless they are
very close together, which reduces false positives. The DHFR PCA can be performed on either
solid or liquid medium. It is therefore easy to study PPIs under several growth conditions
or in the presence of cellular perturbations. It can also be followed in real time, which
gives access to the study of interaction dynamics (56). These features offer some advantages
over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths giving several resolutions of the PPI network. The maximum length
that still detects plausible protein-protein associations while limiting false positives
would first need to be determined. The optimal increment would also need to be determined,
so as to maximize the new information gained while taking into account the added complexity
of each additional linker. The availability of robotic platforms makes it more realistic to
create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such
additional collections would give a picture of the yeast protein-protein association network
at different resolutions, from fine to coarse. Indeed, the longer the linker, the more
distant the detected associations, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be
needed for large protein complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are linked to
various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale
of resolution afforded by extending the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36
22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
divided into two main categories: methods that determine the composition of protein
complexes, and methods that determine the physical interactions between two proteins.
The first category includes methods that purify a protein complex, either by affinity or
separation chromatography, and then analyze it by mass spectrometry (MS). The second
category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane
yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods
in the second category share a very similar principle: the reconstitution of a functional
reporter that emits a signal when the two proteins physically interact. The second category
also includes three hybrid methods: energy transfer between fluorescent molecules (FRET),
cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this
context, "hybrid method" refers to methods that detect associations between proteins that
are close together in space without necessarily interacting physically. These methods
therefore have characteristics of both categories. In this project, they are considered
part of the second category because they provide information on the spatial relationships
between proteins.
The two categories of methods are complementary: one defines the components of a protein
complex, and the other defines the relationships among those components.
1.3.1 Methods identifying the members of a protein complex: purification
of protein complexes followed by mass spectrometry
Purification of protein complexes followed by identification of their components by MS
aims to isolate a protein complex and identify its members. Several techniques are used to
purify protein complexes, including affinity chromatography. Affinity chromatography
separates a protein of interest and its interactors from a protein extract by means of an
epitope specific to that protein. The epitope is recognized by an antibody bound to the
purification column. Several rounds of purification can be performed to reduce the
nonspecific interactions that cause background noise. The isolated proteins are then
digested into peptides. The mass spectrometer ionizes these peptides and separates them
according to their mass-to-charge ratio, yielding a mass spectrum. Comparing the resulting
profiles with those in a database identifies the proteins present in the complex (38-40).
Tandem mass spectrometry (MS/MS) can also be performed: from a first MS, a peptide is
selected and fragmented, and a new spectrum is acquired from the resulting fragments. This
additional spectrum provides more information on the peptide (41, 42). Other purification
techniques exist, such as size-exclusion chromatography, in which separation is based on
the size of the protein complexes. The main value of this purification is that it makes it
possible to isolate all the protein complexes of an organism for study (43).
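The separation by mass-to-charge ratio described above can be illustrated with a short calculation. This is a minimal sketch, assuming positive-mode ionization in which a peptide of neutral mass M gains z protons, so that m/z = (M + z × 1.007276) / z. The function name and the 1500 Da peptide mass are hypothetical and are not taken from the thesis.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide that has gained `charge` protons.

    ASSUMPTION: ionization by protonation only (positive mode).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# HYPOTHETICAL peptide with a neutral monoisotopic mass of 1500.0 Da.
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mass_to_charge(1500.0, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason database matching works on m/z profiles rather than on masses alone.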
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment
complementation
Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter
fragments, each attached to one of the two proteins of interest through a linker. When the
two proteins of interest physically interact, the two reporter fragments assemble,
reconstituting a functional reporter that produces a detectable signal.
In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast
S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor
was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method
that enabled the development of several others. However, the technique has some
limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of
the method have nevertheless been developed to allow the study of membrane proteins
(45-47); one such method is the subject of the next paragraph. Second, because the reporter
is a transcription factor, the interactions tested must take place in the nucleus, which
may alter the endogenous localization of the proteins. The technique also has low
sensitivity, suffers from background noise and is not quantitative. It often requires
overexpression of the proteins, which can generate false positives. It is therefore
impossible to relate the abundance of a protein to the strength or abundance of an
interaction between proteins (48-50). Despite these constraints, Y2H is still widely used
because it allows the PPIs of another species, such as human, to be studied in a simpler
model (51).
In MYTH, the two reporter fragments derive from a mutated ubiquitin to which a
transcription factor is attached. When the proteins of interest physically interact, the
transcription factor attached to the reconstituted ubiquitin is released, activating
transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances
in the study of insoluble membrane proteins located outside the nucleus. However, MYTH
shares some drawbacks with Y2H, such as substantial background noise and the impossibility
of quantifying the results (47-50, 52, 53).
PCA is similar to the two methods described above, but instead of a transcription factor
it uses as reporter a protein that has been split into two fragments. The choice of
reporter and of the cleavage site were decisive elements in the design of the method.
Moreover, because the reporter fragments come from a single protein rather than from two
subunits of the same protein, they do not tend to associate spontaneously unless they are
very close to each other, which reduces background noise (54). In yeast, PCA uses as
reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers
resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and
is involved, notably, in the synthesis of certain DNA bases (purines and thymine). In
yeast, the observed signal is cell density, that is, the number of cells that managed to
grow on the selection medium. This technique has the advantage of being quantitative and of
preserving the native promoter of the proteins studied (48, 55, 56). Results obtained by
PCA also suggest that the cellular localization of the proteins is preserved: there is gene
ontology enrichment for several known proteins sharing the same cellular localization (55).
However, a change in localization cannot be ruled out, given that the reporter fragments
are added at the C-terminus, which could interfere with the localization signal sequence of
the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function or abundance of the protein.
Adding a linker often reduces these risks by moving the reporter fragment away from the
protein to which it is attached, which reduces interference between the two. The
composition or length of the linker may need to be optimized. There are three categories of
linkers: flexible linkers, rigid linkers and linkers cleavable in vivo. Flexible linkers
are generally used when some mobility between the protein of interest and the reporter
fragment is desirable. Rigid linkers provide better separation between the protein of
interest and the reporter fragment and ensure that the functions of each element are
maintained. They are mostly useful when a flexible linker is insufficient to separate the
two elements properly or interferes with the activity of the protein. Linkers cleavable in
vivo allow the reporter fragment to be released under certain conditions. They are
particularly useful for allowing each element to carry out its own biological activity. It
is therefore essential to choose the linker and its parameters carefully to obtain the
expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower
resolution.
FRET relies on the transfer of energy between two fluorescent proteins in close proximity.
The two fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor
fluorescent protein when the two proteins are close to each other. The interaction is
detected by microscopy or cytometry through the emission of the acceptor fluorescent
protein. This method is particularly useful for following an interaction over time.
However, substantial background noise and partial overlap between the fluorescence of the
two proteins can hinder interpretation of the results (60-63).
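The proximity dependence of FRET can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer efficiency is 50%. The sketch below is illustrative only: the function name and the Förster radius of 5 nm are assumptions chosen for the example, not values reported in this work.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster relation: fraction of donor excitations transferred to the acceptor."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# ASSUMED Förster radius, for illustration only.
R0_NM = 5.0

for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

Because of the sixth-power dependence, efficiency drops steeply once the two fluorescent proteins are more than about one Förster radius apart, which is why a FRET signal is read as evidence of proximity.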
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that before purification the proteins are attached to one another by
covalent bonds. These bonds resist enzymatic digestion and thus provide structural
information on how the proteins associate within the complex. Nevertheless, cross-linking
complicates data analysis and can lead to a misleading picture of the architecture of the
protein complex. The method is difficult to apply to the global study of protein complexes
(64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused
to the protein of interest. Interactors carrying a biotin group on their accessible lysines
are selectively isolated and identified by MS. BioID can detect weak and transient
interactions in addition to interactions between neighboring proteins. However, the biotin
ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in
molecular biology. This large size can interfere with the activity of the protein of
interest or with the formation of interactions. In addition, the method is not quantitative
(68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly interesting because they give a more
global view of the PPI network. They report on the proximity of proteins, giving access to
a scale of molecular resolution that is otherwise difficult to reach. In addition to being
complex, existing techniques require specific infrastructure (equipment and databases) and
are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid
methods would help define the architecture of protein complexes and their subcomplexes at
low molecular resolution. Such methods would complement the two categories of methods.
They would also compensate for the shortcomings of high-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure of
proteins or protein complexes. These high-resolution methods are difficult to apply to many
protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially interesting parameter for
modulating the detection of protein-protein interactions
Because of its relative simplicity and of the linker that joins the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble and flexible peptide segment composed of two repeats of the
following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good
association of the reporter fragments in the cellular environment. Glycine and serine are
both small amino acids, the former nonpolar and the latter polar and uncharged. The linker
joins the reporter fragment to the C-terminus of the proteins under study.
Linker length also constrains the ability to detect an interaction, as observed by the
research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and
several other protein complexes, the authors noted that an interaction was 3.5 times more
likely to be detected when the C-termini of the proteins of interest were less than 82 Å
apart (55). This distance corresponds to the length of the two linkers placed end to end.
In addition, an earlier study had shown that increasing linker length made it possible to
determine the conformation of a dimeric receptor (69). It is thus possible to detect new
interactions and, in doing so, to obtain new structural information.
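The relationship between linker length and detection distance can be sketched as a back-of-the-envelope calculation. The only figure taken from the text is that two standard linkers (two GGGGS repeats each, 20 residues in total) placed end to end span about 82 Å. Everything else below is an assumption made for illustration: the linear scaling with residue count, the function name, and the longer linker combinations. Flexible linkers are rarely fully extended in the cell, so these numbers are rough upper bounds, not measured values.

```python
# Reference point from the text: two linkers of 2 x GGGGS (20 residues in total)
# placed end to end correspond to about 82 angstroms.
REFERENCE_SPAN_ANGSTROM = 82.0
REFERENCE_RESIDUES = 20
RESIDUES_PER_REPEAT = 5  # one GGGGS motif


def max_span_angstrom(repeats_a: int, repeats_b: int) -> float:
    """Estimate the end-to-end span of two linkers.

    ASSUMPTION: the span scales linearly with the number of residues.
    """
    if repeats_a < 1 or repeats_b < 1:
        raise ValueError("each linker needs at least one GGGGS repeat")
    residues = (repeats_a + repeats_b) * RESIDUES_PER_REPEAT
    return REFERENCE_SPAN_ANGSTROM * residues / REFERENCE_RESIDUES


# HYPOTHETICAL linker combinations, given as numbers of GGGGS repeats.
for a, b in [(2, 2), (2, 3), (3, 3), (4, 4)]:
    print(f"{a} + {b} repeats: ~{max_span_angstrom(a, b):.1f} angstroms")
```

Under this simple model, each added repeat on one linker extends the reach by roughly 20 Å, which is the intuition behind using linker length as a rough molecular ruler.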
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow
the detection of interactions increasingly distant in space, thereby modulating the scale
of molecular resolution. This adaptation would yield a new hybrid method that could help
define protein-protein associations between protein complexes and subcomplexes. The first
objective was to assess the general impact of different linker lengths on the ability to
detect protein-protein associations. To this end, protein-protein associations between 15
proteins found in seven protein complexes were tested against the proteins found in these
complexes and their known interactors. The second objective was to assess the impact of
increased linker length on our understanding of the architecture of protein complexes and
their subcomplexes. Five protein complexes differing in size and flexibility were studied:
the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The
study was carried out with different combinations of linker lengths. The final objective
was to test whether increasing linker length allowed the detection of associations between
proteins farther apart in space. To do so, distances between the proteins contained in
proteasome structures were calculated and compared with the experimental results.
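The comparison between structural distances and detection described in the last objective can be sketched as follows. This is a minimal illustration, assuming that each protein is represented by the three-dimensional coordinates of its C-terminus and that distances are plain Euclidean distances; the subunit names and coordinates are hypothetical and do not come from any real proteasome structure. The 82 Å limit is the value reported in the text for two standard linkers.

```python
import math
from itertools import combinations

DETECTION_LIMIT_ANGSTROM = 82.0  # reported in the text for two standard linkers

# HYPOTHETICAL C-terminus coordinates in angstroms (not from a real structure).
C_TERMINI = {
    "subunit_A": (0.0, 0.0, 0.0),
    "subunit_B": (30.0, 40.0, 0.0),
    "subunit_C": (0.0, 0.0, 120.0),
}


def c_terminal_distance(name_a: str, name_b: str, coords=C_TERMINI) -> float:
    """Euclidean distance between the C-termini of two proteins."""
    return math.dist(coords[name_a], coords[name_b])


def within_reach(name_a: str, name_b: str, limit=DETECTION_LIMIT_ANGSTROM) -> bool:
    """True if the two C-termini are close enough for the linkers to bridge."""
    return c_terminal_distance(name_a, name_b) <= limit


for a, b in combinations(C_TERMINI, 2):
    d = c_terminal_distance(a, b)
    print(f"{a} - {b}: {d:.1f} angstroms, within reach: {within_reach(a, b)}")
```

Raising `limit` mimics the expected effect of longer linkers: pairs that were out of reach become candidates for detection.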
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous powerful
genetic tools, its rapid cell division and the abundance of data on the structure of
protein complexes and on PPIs. Moreover, this organism has played a key role in advancing
knowledge in various fields, such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks and the study of human diseases
(70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble
with one another into complexes and determining their spatial arrangements. We examined the
potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR
PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR
fragments improves the detection of protein-protein interactions and reveals interactions
that are more distant in space. Extended linkers thus provide an improved tool for
detecting and measuring protein-protein interactions and protein proximity in vivo. We used
this tool to further investigate the architecture of the RNA polymerases, the proteasome
and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues
for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are farther apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations
(31, 36, 74, 75), and have shown how the regulation of protein expression at the
transcriptional, translational and posttranslational levels contributes to the diversity of
protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex membership or detect physical association
(81). The first category includes methods based on protein purification followed by mass
spectrometry. In this case, assignment of a protein to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity
purification (82, 83). The majority of PPIs that populate interactome databases derive from
such methods because a single purification leads to the inference of many interactions
among the co-purified proteins. Unfortunately, very little is known about the structural
and context dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex
(84-86). The second category of methods reports binary or pairwise interactions between
proteins and reveals direct or nearly direct interactions. Such methods include the
commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs)
(87) and technologies based on similar principles (52). The two categories are potentially
complementary because, on the one hand, they tell us which proteins assemble into complexes
in the cell and, on the other hand, how proteins may be physically located relative to one
another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future developments in cell and systems biology, because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing methods
that detect co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-termini. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94).
Proteins are usually connected to the reporter fragments with a linker of ten amino acids.
In principle, the length of the linker limits the maximum distance between the proteins at
which an interaction is detectable. The first large-scale study performed using DHFR PCA in
yeast showed that the distance constraint imposed by linker length could affect the ability
to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between the C-termini of proteins could be measured,
protein interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in
a dimeric membrane receptor (69). Together, these results suggest that linkers of variable
sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit
roughly, distances between proteins in living cells. Here we test the effect of linker size
on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown
on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium))
containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for
transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX
medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2%
glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL
methotrexate (MTX) diluted in DMSO).
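The medium composition above can be converted into amounts to weigh for a given batch. The sketch below assumes that the percentages are weight per volume (grams per 100 mL), which the text does not state explicitly; the function names and the 500 mL batch volume are hypothetical.

```python
def grams_for_percent_wv(percent_wv: float, volume_ml: float) -> float:
    """Grams of solute for a % (w/v) concentration. ASSUMPTION: % means g per 100 mL."""
    return percent_wv * volume_ml / 100.0


def milligrams_for_ug_per_ml(conc_ug_per_ml: float, volume_ml: float) -> float:
    """Milligrams of compound for a concentration given in micrograms per mL."""
    return conc_ug_per_ml * volume_ml / 1000.0


# Solid components of the MTX medium as listed in the text.
MTX_MEDIUM_PERCENT = {
    "yeast nitrogen base": 0.67,
    "glucose": 2.0,
    "Noble agar": 2.5,
}

BATCH_VOLUME_ML = 500.0  # HYPOTHETICAL batch size

for component, percent in MTX_MEDIUM_PERCENT.items():
    grams = grams_for_percent_wv(percent, BATCH_VOLUME_ML)
    print(f"{component}: {grams:.2f} g")
print(f"methotrexate: {milligrams_for_ug_per_ml(200.0, BATCH_VOLUME_ML):.0f} mg")
```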
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar
(for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to linkers of varying size. Both
original plasmids contained the sequence coding for two repeats of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repeats of the motif (one for the 3xL and two for the
4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repeats were
composed of synonymous codons encoding the same peptide sequence.
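The use of synonymous codons can be illustrated by translating two different DNA sequences that both encode one GGGGS repeat. This is a sketch under stated assumptions: the two sequences are hypothetical examples written for this illustration, not the sequences synthesized in this study, and the codon table is deliberately limited to the standard glycine and serine codons.

```python
# Partial standard codon table: glycine and serine codons only.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a coding sequence made only of Gly and Ser codons."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# HYPOTHETICAL encodings of one GGGGS repeat using different codons.
repeat_v1 = "GGTGGTGGTGGTTCT"
repeat_v2 = "GGCGGAGGGGGTAGC"

print(translate(repeat_v1), translate(repeat_v2))
print("same peptide:", translate(repeat_v1) == translate(repeat_v2))
print("same DNA:", repeat_v1 == repeat_v2)
```

The two sequences differ at the nucleotide level while encoding the same peptide, so added repeats need not duplicate the DNA sequence of the linker already present.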
To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR
F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into plasmid pUC57
with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then
amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR
F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid
without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were
assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were
transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were
confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR-amplified
from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused to the DHFR fragment (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to the
2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected according to the criteria that they belonged to the same
complexes as the baits or that they interacted with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
proteins that are members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
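The scoring described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in the study: the background mean b and standard deviation sdb are assumed to have been estimated beforehand (the text fits a two-component normal mixture with the R package mixtools), and the function names and example values are hypothetical.

```python
import math
from statistics import median

Z_DETECTION = 2.5   # Zs above this value counts as a detected interaction
Z_CHANGE = 2.0      # Zs difference treated as a significant change


def interaction_score(intensities, min_replicates=2):
    """Median of log2(intensity + 1) over replicate colonies (the Is).

    Returns None when fewer than `min_replicates` colonies are available.
    """
    if len(intensities) < min_replicates:
        return None
    return median(math.log2(v + 1) for v in intensities)


def z_score(i_s, b, sd_b):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background distribution."""
    return (i_s - b) / sd_b


def compare_linkers(zs_ref, zs_long):
    """Classify a longer-linker Zs against the 2xL reference Zs."""
    if zs_long - zs_ref > Z_CHANGE:
        return "increased"
    if zs_ref - zs_long > Z_CHANGE:
        return "decreased"
    return "unchanged"


# Hypothetical replicate intensities for one bait-prey pair.
i_s = interaction_score([3, 7, 15, 31])
zs = z_score(i_s, b=1.5, sd_b=0.5)   # b and sd_b are made-up background values
print(i_s, zs, zs > Z_DETECTION)
```

Keeping the thresholds as named constants makes it easy to see how sensitive the list of detected interactions is to the 2.5 and 2 cutoffs.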
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
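The outlier exclusion step can be illustrated with a short Python sketch. It is not the authors' code: the quartile estimator is an assumption (the text does not say which one was used; the "inclusive" method of the standard library is chosen here), and the function name and colony values are hypothetical.

```python
from statistics import quantiles


def remove_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]; k = 3 as in the text."""
    # Assumption: quartiles computed with the "inclusive" interpolation method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes with one aberrant colony.
colonies = [10, 11, 12, 13, 14, 15, 16, 100]
print(remove_extreme_outliers(colonies))
```

With a multiplier of 3 rather than the more common 1.5, only colonies far outside the bulk of the replicates are discarded, which matches the "extreme outliers" wording of the text.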
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
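The classification rules above can be sketched as follows. This is a hedged illustration rather than the study's implementation: the text only states that p-values were FDR-corrected, so the Benjamini-Hochberg procedure used here is an assumption, the paired t-test itself is taken as already computed, and all names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, differences, q_values,
                  min_diff=1.0, alpha=0.05):
    """Assign a protein pair to one of the categories described in the text.

    differences: Is difference versus 2xL-2xL for each longer linker combination.
    q_values: matching FDR-corrected p-values from the paired t-tests.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, q in zip(differences, q_values):
        if abs(diff) > min_diff and q < alpha:
            return "quantitative change"
    return "unchanged"


# Hypothetical raw p-values for four comparisons.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
print(classify_pair(True, True, [1.4, 0.2, 0.3], [0.01, 0.5, 0.5]))
```

Requiring both an effect size above 1 and a corrected p-value below 0.05 avoids calling a change on the basis of a small but very reproducible shift.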
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the core particle (CP),
using the super command in the PyMOL software. Visual inspection of the resulting superposed
5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is
well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base,
the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4.
Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). Cα atoms of surface residues were used as
nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered as surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
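The surface-path distance described above can be illustrated with a small, self-contained Python sketch. It is not the code used in the study: the text relies on NetworkX, whereas this version re-implements Dijkstra's algorithm with the standard library so that it runs without dependencies, and the coordinates, cutoff value and function names are hypothetical.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Connect every pair of points no farther apart than `cutoff`.

    coords: list of (x, y, z) positions, e.g. C-alpha atoms of surface residues.
    Each edge is weighted by the Euclidean distance between its two nodes.
    """
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target cannot be reached."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf


# Four hypothetical surface points; nodes 0 and 3 stand in for two C-termini.
points = [(0, 0, 0), (10, 0, 0), (10, 10, 0), (20, 10, 0)]
graph = build_surface_graph(points, cutoff=15)
print(shortest_path_length(graph, 0, 3), math.dist(points[0], points[3]))
```

Because the path is constrained to hop between nearby surface points, the resulting distance is never shorter than the straight line between the two termini, which is the reason for using it as a proxy for the distance a linker must span around the complex.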
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3]
counted as a single PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
conclusion that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are better detected with longer linker sizes (Fig. S1D).
Finally, we find that the distance among proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering every pair of reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
33
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 936
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 808
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 603
5FJA chain A Rpo31 RNApol III 962
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 739
5FJA chain G Rpc25 RNApol III 858
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 827
5FJA chain H Rpb8 RNApol III 945
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 849
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 843
5FJA chain N Rpc53 RNApol III 738
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 572
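The identity values in this table compare the sequence found in each structure with the experimental sequence. The exact procedure is not described here, so the following is only a minimal sketch under stated assumptions: the two sequences are already aligned (gaps written as "-"), and identity is counted over columns where both sequences have a residue. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_structure, aligned_reference):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have the same length.
    """
    if len(aligned_structure) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_structure, aligned_reference):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment: one mismatch (T/S) and one gap column.
print(percent_identity("MKT-LV", "MKSALV"))
```

Other conventions exist (for example, dividing by the length of the shorter sequence), and they can give different percentages for the same alignment.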
Table S2C. Identity between the proteasome structure and the experimental sequence
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III
and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes (Proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results for the intra-complexes experiment. Examples
for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
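The thresholding described in panel (B) can be sketched as follows. This is an illustration only: the legend gives the z-score cutoff, but not how the z-score is computed, so the sketch assumes a standard z-score relative to the mean and sample standard deviation of a background distribution of colony sizes. All names and values are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff for positive PPIs, as stated in the legend


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background distribution.

    Assumption: z = (size - background mean) / background sample SD.
    """
    mu = mean(background)
    sigma = stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background, threshold=Z_THRESHOLD):
    """Return True for each colony whose z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical toy values (arbitrary colony-size units).
background = [100, 110, 90, 105, 95]
observed = [104, 140, 98]
calls = call_positive(observed, background)
print(calls)
```

Only the second colony, far above the background, would be called positive with these toy values.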
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
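The distance-weighted shortest path of panel (C) can be illustrated with a short sketch. It is not the code used for the thesis: it assumes that surface residues are graph nodes, that two residues are connected when they lie within a hypothetical cutoff distance, and that each edge is weighted by the Euclidean distance between them. Dijkstra's algorithm then gives the path length between two C-terminal residues. The coordinates, names and cutoff are invented for the example.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect points closer than `cutoff`; edge weight = Euclidean distance."""
    names = list(coords)
    graph = {name: [] for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Hypothetical coordinates (arbitrary units) for two C-termini and two
# surface residues lying between them.
coords = {
    "cterm_a": (0.0, 0.0, 0.0),
    "surf_1": (3.0, 0.0, 0.0),
    "surf_2": (3.0, 4.0, 0.0),
    "cterm_b": (6.0, 4.0, 0.0),
}
graph = build_graph(coords, cutoff=4.5)
length = shortest_path_length(graph, "cterm_a", "cterm_b")
print(length)
```

Restricting the path to surface residues gives a distance that goes around the complex rather than through it, which is longer than the straight-line distance between the two termini.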
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins in close
spatial proximity, without these associations necessarily being physical interactions.
Such a method would make it possible to examine and dissect the architecture of protein
complexes in greater depth. In practice, the approach consisted of modifying the length of
the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first
necessary to verify whether increasing linker length changed the interactions detected.
It was also relevant to test the method on protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further
confirmed by comparing the results with distances measured from the available protein
structures of the proteasome.
The results of the first validation show that changing a single parameter, namely
doubling the length of one linker, significantly increased the signal-to-noise ratio and
allowed better identification of associations. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the associations detected suggests that the specificity
of the DHFR PCA is preserved despite the change in linker length. The in-depth study of
the five protein complexes shows that this variation of the DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain
nearly as many new interactions if this variation of the PCA were applied to protein
complexes that have already been studied. This percentage could vary with the number of
combinations of linkers of different lengths that are used. For example, it could be
lower if only one combination were used, since some protein-protein associations were
detectable only with one specific combination of linkers. Using an extended linker for
the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most
of those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers is informative
about the organization of protein complexes, particularly when central proteins are
involved. Finally, the associations detected reflect well the organization of protein
complexes into subcomplexes. Comparing the distances between proteins in the proteasome
structures with the PCA results confirms that increasing linker length does make it
possible to detect associations between proteins that are further apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of other lengths. This would provide a validation and a point of
comparison, and would help detect problems that occurred during strain construction, such
as incorrect recombination or the appearance of mutations. Interactions would be observed
for the correctly constructed protein, whereas the problematic one would show none. Adding
this control does, however, make the experiments and the analyses more complex. Despite
this drawback, this variation of the DHFR PCA provides an additional hybrid method that
remains relatively simple. It requires no special infrastructure, yet it can also be
applied at large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo
method that retains the endogenous promoter for protein expression. The fragments do not
tend to interact spontaneously unless they are very close to each other, which reduces
false positives. The DHFR PCA can be performed on either solid or liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of
cellular perturbations. It can also be monitored in real time, which gives access to the
study of interaction dynamics (56). These features offer certain advantages over other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to
establish a range of linker lengths that would provide several resolutions of the PPI
network. The first step would be to determine the maximal length that still detects
plausible protein-protein associations while limiting false positives. The optimal
increment would also have to be determined, so as to maximize the new information gained
while taking into account the complexity added by each additional linker. The availability
of robotic platforms makes it more realistic to create collections of DHFR F[1,2]-tagged
proteins with different linker lengths. Such additional collections would provide a
picture of the yeast protein-protein association network at different resolutions, from
fine to coarse. Indeed, the longer the linker, the more distant the associations detected,
which lowers the molecular resolution. Before investigating a protein complex more
exhaustively, its characteristics, such as its size and flexibility, should be taken into
account. For small protein complexes, a finer resolution, and therefore shorter linkers,
may be sufficient, whereas a coarser resolution would be appropriate for large protein
complexes.
The method developed during this master's project is particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or with other imaging methods.
The size of these complexes greatly limits their study and makes determining their
architecture a challenge. Processing bodies and stress granules are examples. They are
involved, respectively, in the degradation and the storage of messenger RNA during
cellular stress, and they are linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the
linker would give a general picture of their architecture. At the level of an organism's
proteome, this method would provide a better definition of the organization of the
cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
in order to reduce the nonspecific interactions that cause background noise. The isolated
proteins are then digested into peptides. The mass spectrometer ionizes these peptides and
separates them according to their mass-to-charge ratio, yielding a mass spectrum. Comparing
the resulting profiles with those in a database identifies the proteins present in the
complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed: from a first MS, a
peptide is selected and fragmented, and a new spectrum is acquired from the resulting
fragments. This additional spectrum provides more information about the peptide (41, 42).
Other purification techniques exist, such as size-exclusion chromatography, in which
separation is based on the size of the protein complexes. The main advantage of this
purification is that it isolates all the protein complexes of an organism for study (43).
1.3.2 Methods determining the protein interaction network
1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and protein-fragment complementation
Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments,
each linked to one of the two proteins of interest through a linker. When the two proteins of
interest interact physically, the two reporter fragments assemble, reconstituting a
functional reporter that produces a detectable signal. In Y2H, the reporter is a
transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a
specific selection medium. Initially, the transcription factor was Gal4p and the selection
medium contained galactose (44). Y2H was a pioneering method that enabled the development of
several others. However, the technique has some limitations. First, in classical Y2H the
proteins studied must be soluble. Variations of the method have nevertheless been introduced
to allow the study of membrane proteins (45-47); these are the subject of the next paragraph.
Second, because the reporter is a transcription factor, the interactions tested must take
place in the nucleus, which may alter the endogenous localization of the proteins. The
technique also has low sensitivity, produces background noise, and is not quantitative. It
often requires overexpression of the proteins, which can generate false positives. It is
therefore impossible to relate the abundance of a protein to the strength or abundance of an
interaction between proteins (48-50). Despite these constraints, it is still widely used
because it allows the PPIs of another species, such as human, to be studied in a simpler
model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription
factor is attached. When the proteins of interest interact physically, the transcription
factor attached to the reconstituted ubiquitin is released, activating transcription of a
reporter gene. Split-ubiquitin-based methods have led to major advances in the study of
membrane proteins, which are insoluble and located outside the nucleus. However, MYTH shares
some drawbacks with Y2H, such as substantial background noise and the inability to quantify
results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription
factor as the reporter, it uses a protein that has been cleaved into two fragments. The
choice of reporter and of cleavage site were decisive in the design of the method. Moreover,
because the reporter fragments come from a single protein rather than from two subunits of
the same protein, they do not tend to associate spontaneously unless they are very close to
each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated
version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate
(MTX) on the cell. This enzyme is essential for cell growth and is involved in particular in
the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal
is cell density, that is, the number of cells that managed to grow on the selection medium.
The technique has the advantage of being quantitative while preserving the native promoter of
the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular
localization of proteins is preserved: there is a gene ontology enrichment for several known
proteins sharing the same cellular localization (55). However, a change in localization
cannot be ruled out, given that the reporter fragments are added at the C-terminus, which
could interfere with the proteins' localization signal sequence (57).
A major drawback of most of these techniques stems from the addition of reporter fragments,
which can affect the folding, cellular function, or abundance of the protein. Adding a linker
often reduces these risks by moving the reporter fragment away from the protein to which it
is attached, which reduces interference between the two proteins. Its composition or length
may need to be optimized. There are three categories of linkers: flexible linkers, rigid
linkers, and in vivo cleavable linkers. Flexible linkers are generally used when some
mobility between the protein of interest and the reporter fragment is desirable. Rigid
linkers provide better separation between the protein of interest and the reporter fragment
and ensure that the functions of each element are maintained. They are mainly useful when a
flexible linker is insufficient to separate the two elements properly or interferes with the
activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released
under certain conditions. They are particularly useful for allowing each element to carry out
its own biological activity. It is therefore essential to choose the linker and its
parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two
fluorescent proteins are fused to the two proteins whose proximity is to be tested.
Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent
protein when the two proteins are close to each other. The interaction is detected by
microscopy or cytometry through the emission of the acceptor fluorescent protein. This method
is particularly useful for following an interaction over time. However, substantial
background noise and the partial overlap of the fluorescence of the two proteins can hamper
interpretation of the results (60-63).
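The distance dependence that makes FRET usable as a "spectroscopic ruler" follows the standard Förster relation, E = 1 / (1 + (r / R0)^6). The short sketch below only illustrates how steeply the transfer efficiency drops with distance; the Förster radius of 50 Å is an illustrative assumption, not a value taken from this work, and the function name is hypothetical.

```python
def fret_efficiency(distance_angstrom: float,
                    forster_radius_angstrom: float = 50.0) -> float:
    """Transfer efficiency from the Förster relation E = 1 / (1 + (r / R0)**6).

    forster_radius_angstrom (R0) defaults to 50 Å, an assumed illustrative
    value; the real R0 depends on the donor/acceptor pair.
    """
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    # Efficiency is 0.5 at r = R0 and falls off quickly beyond it.
    for r in (25.0, 50.0, 75.0, 100.0):
        print(f"r = {r:5.1f} Å  ->  E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power term, the signal is informative mainly for distances close to R0, which is one reason FRET reports on proximity rather than on long-range organization.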
Cross-linking followed by MS is virtually identical to the purification and MS techniques,
except that before purification the proteins are attached to one another by covalent bonds.
These bonds withstand enzymatic digestion, thus providing structural information on how the
proteins associate within the protein complex. However, cross-linking complicates data
analysis and may lead to a mistaken picture of the architecture of the protein complex. The
method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins.
Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the
protein of interest. Interactors carrying a biotin group on their accessible lysines are
selectively isolated and identified by MS. BioID can detect weak and transient interactions
in addition to interactions between neighbouring proteins. However, the biotin ligase is
larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular
biology. This large size can impair the activity of the protein of interest or the formation
of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more global
view of the PPI network. They report on the proximity of proteins, giving access to a new
scale of molecular resolution that is otherwise difficult to reach. In addition to being
complex, existing techniques require specialized infrastructure (equipment and databases) and
are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods
would help to better define the architecture of protein complexes and their subcomplexes at
low molecular resolution. Such methods would complement the two categories of methods. These
new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods
such as crystallography or nuclear magnetic resonance, which determine the precise structure
of proteins or protein complexes. Indeed, those methods are difficult to apply to many
protein complexes and require an approach specific to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of
protein-protein interactions
Because of its relative simplicity and because a linker connects the reporter fragments to
the proteins of interest, PCA is a method of choice for developing a hybrid method. The
linker is a short, soluble, flexible peptide segment composed of two repeats of the following
motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association
of the reporter fragments in the cellular environment. Indeed, glycine and serine are two
small amino acids, glycine being nonpolar and serine polar and uncharged. The linker connects
the reporter fragment to the C-terminus of the proteins under study.
Linker length also places a constraint on the ability to detect an interaction, as observed
notably by the research team that developed large-scale PCA (55). Studying RNA polymerase
(RNApol) II and several other protein complexes, the authors noted that an interaction was
35 times more likely to be detected when the C-termini of the proteins of interest were less
than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end
to end. Furthermore, an earlier study had shown that increasing the linker length made it
possible to determine the conformation of a dimeric receptor (69). It is thus possible to
detect new interactions and, in doing so, to obtain new structural information.
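The correspondence between 82 Å and two 10-residue linkers placed end to end can be turned into a rough estimate of the reach of longer linkers. The sketch below assumes that reach scales linearly with the number of residues and that both linkers are fully extended; the per-residue value is simply derived from the 82 Å figure quoted above, the size of the reporter fragments is ignored, and the function name `max_span` is illustrative rather than part of any published tool.

```python
RESIDUES_PER_REPEAT = 5            # one GGGGS motif
REFERENCE_SPAN_ANGSTROM = 82.0     # two 2-repeat linkers end to end (from the text)
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT  # 20 residues in total

# Implied length per residue (assumption: linear, fully extended linkers).
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Estimated end-to-end reach (Å) of two linkers with the given GGGGS repeat counts."""
    if repeats_a < 0 or repeats_b < 0:
        raise ValueError("repeat counts must be non-negative")
    residues = (repeats_a + repeats_b) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for a, b in [(2, 2), (2, 3), (3, 3), (2, 4), (4, 4)]:
        print(f"{a} + {b} repeats -> about {max_span(a, b):.0f} Å")
```

Under these assumptions, each added repeat on one side extends the estimated reach by about 20 Å; real flexible linkers rarely stay fully extended, so these values are best read as upper bounds.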
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The
hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the
detection of interactions increasingly distant in space, which would modulate the scale of
molecular resolution. This adaptation would then yield a new hybrid method that could help
define protein-protein associations between protein complexes and subcomplexes. The first
objective was to assess the general impact of different linker lengths on the ability to
detect protein-protein associations. To this end, protein-protein associations between 15
proteins found in seven protein complexes were tested against the proteins found in these
complexes and their known interactors. The second objective was to assess the impact of
increasing linker length on our understanding of the architecture of protein complexes and
their subcomplexes. Five protein complexes differing in size and flexibility were studied:
the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The
study was carried out with different combinations of linker lengths. The final objective was
to test whether increasing linker length allowed the detection of associations between
proteins more distant in space. To do so, distances were calculated between the proteins
contained in the proteasome structures and compared with the experimental results.
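The comparison described for the last objective can be sketched as follows. This is a minimal illustration, not the analysis used in this work: the subunit names and coordinates are hypothetical placeholders rather than values from any proteasome structure, and the 82 Å cutoff is the distance reported in (55) for the original linker.

```python
from itertools import combinations
from math import dist

# Hypothetical C-terminus coordinates in Å (placeholders, not from a real structure).
c_termini = {
    "subunit_A": (0.0, 0.0, 0.0),
    "subunit_B": (30.0, 40.0, 0.0),
    "subunit_C": (0.0, 60.0, 80.0),
}


def pairwise_distances(coords):
    """Euclidean distance (Å) between every pair of C-termini."""
    return {
        (a, b): dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def within_reach(distances, threshold_angstrom):
    """Flag the pairs whose C-termini are no farther apart than the threshold."""
    return {pair: d <= threshold_angstrom for pair, d in distances.items()}


if __name__ == "__main__":
    distances = pairwise_distances(c_termini)
    reachable = within_reach(distances, threshold_angstrom=82.0)
    for pair, d in distances.items():
        print(pair, f"{d:.1f} Å", "within reach" if reachable[pair] else "out of reach")
```

The resulting flags could then be set against the pairs detected experimentally with each linker length, for example by counting how many detected pairs fall under each cutoff.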
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous powerful
genetic tools, its rapid cell division, and the abundance of data on the structure of protein
complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge
in various fields, such as the determination of protein function, regulatory networks, gene
expression, protein interaction networks, and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with
one another into complexes and determining their spatial arrangements. We examined the
potential of the protein-fragment complementation assay based on dihydrofolate reductase
(DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We
showed that using extended peptide linkers between the fusion proteins and the DHFR fragments
improves the detection of protein-protein interactions and reveals interactions that are more
distant in space. Extended linkers thus provide an improved tool for detecting and measuring
protein-protein interactions and protein proximity in vivo. We used this tool to further
investigate the architecture of the RNA polymerases, the proteasome, and the conserved
oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are farther apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the
architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG)
complexes in yeast. Our results open new avenues for the dissection of protein networks in
living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of cellular
functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36,
74, 75), and have shown how the regulation of protein expression at the transcriptional,
translational, and posttranslational levels contributes to the diversity of protein complex
assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity
purification (82, 83). The majority of PPIs that populate interactome databases derive from
such methods because a single purification leads to the inference of many interactions among
the co-purified proteins. Unfortunately, very little is known about the structural and
context dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and
technologies based on similar principles (52). The two categories of methods are potentially
complementary because, on the one hand, they tell us which proteins assemble into complexes
in the cell and, on the other hand, how proteins may be physically located relative to one
another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future development in cell and systems biology because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins
are usually connected to the reporter fragments with a linker of ten amino acids. In
principle, the length of the linker limits the maximum distance between the proteins for an
interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast
showed that the distance constraint determined by linker length could affect the ability to
detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 35 times more likely to be detected if the C-termini were within 82 Å of
each other. In addition, an earlier study in mammalian cells showed that increasing the
linker length of the PCA reporter allows the detection of configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here we test the effect of linker size on the ability to
detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl, and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions
were composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was
gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies targeting them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin, and 2 µg/mL Aprotinin). 425-600 µm glass beads
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000), or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye®680RD Goat anti-Rabbit IgG (1:10000),
IRDye®680RD Donkey anti-Goat IgG (1:5000), or IRDye®800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or because
they interact with one of them based on data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was
also included in the set of preys as controls. Each prey was present in four replicates, two on
each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
proteins that are members of the following complexes: the RNApol I, II, and III, the
proteasome, and the COG complex. These complexes were selected because they vary in size
(RNApol I (n=14), II (n=12), III (n=17), and associated proteins (n=9, 7 tested); proteasome
(n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members
of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In
addition, there are published structures available for the RNApol and proteasome complexes,
making it possible to compare our results with known protein complex organization. We
successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in
MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In
total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used,
a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged
with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used
as baits. Two different prey plates of MATa cells were generated, including all strains
mentioned above. Baits and preys were positioned in a way that, in a block of four strains, all
combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL,
4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for
the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks
were randomly positioned on the colony arrays. Finally, each 1536-array was designed to
contain a double border of a strain showing a weak interaction
(Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two
rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation
time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium
and incubated at 30°C for four days, after which a second round of MTX selection was
performed. Plates were incubated at 30°C for another four days. Images were taken with an
EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the
end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
the differences in signal were increased, null, or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area smaller than 20 pixels and a
circularity lower than 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
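The scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the analysis in the text was done in R with mixtools, the background mean and standard deviation are taken here as given inputs (they would come from the fitted two-component mixture), and the function names and the numbers in the usage example are hypothetical.

```python
import math
import statistics

# Detection threshold on the z-score, taken from the text (assumed to read 2.5).
Z_THRESHOLD = 2.5


def interaction_score(colony_sizes):
    """Median of log2(size + 1) over the replicate colonies of one interaction."""
    return statistics.median(math.log2(size + 1) for size in colony_sizes)


def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb estimated from the background component."""
    return (score - background_mean) / background_sd


def is_detected(score, background_mean, background_sd, threshold=Z_THRESHOLD):
    """An interaction counts as detected when its z-score exceeds the threshold."""
    return z_score(score, background_mean, background_sd) > threshold


if __name__ == "__main__":
    # Hypothetical replicate colony sizes and background parameters.
    replicates = [3, 7, 15]
    score = interaction_score(replicates)
    print(score, z_score(score, 1.0, 0.5), is_detected(score, 1.0, 0.5))
```

Keeping the background parameters as explicit arguments makes it easy to score each linker combination against its own background distribution, as the text describes.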
For the intra-complexes experiment, extreme outliers on the MTX selection plates that were
more distant from the median than Q1 - 3(Q3 - Q1) or Q3 + 3(Q3 - Q1) were excluded (Q1
and Q3 represent the first and third quartiles). Colonies corresponding to the control
interaction and positioned on the array edges were removed from downstream analyses, as
well as strains for which sequencing results revealed mutations in the DHFR fusion proteins.
After these final filtering steps, interactions with at least four replicates for every linker
combination were retained and the median of colony size was used as the Is. Significant
interactions were identified as described above (Fig. S1B). For the RNApol and the
proteasome, the estimated mean (b) and standard deviation (sdb) of the background
distribution were calculated for each linker combination and each complex separately. For the
COG complex, because the number of pairwise interactions is limited to 64, all the results
were combined to calculate these parameters. An interaction was considered as being detected
when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions
with at least one linker combination, some pairs were filtered out, mainly because they did
not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR
F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228
(197 unique) pairs of interacting proteins.
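As a rough illustration of the extreme-outlier rule above, the following Python sketch drops values outside Q1 - 3(Q3 - Q1) and Q3 + 3(Q3 - Q1). The quartile convention (the "inclusive" method of the standard library) is an assumption, since the text does not state which one was used, and the function name and example values are hypothetical.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k = 3 follows the rule in the text."""
    # Quartile method is an assumption; other conventions shift the fences slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


if __name__ == "__main__":
    # Hypothetical replicate colony sizes with one aberrant colony.
    print(remove_extreme_outliers([10, 11, 12, 13, 14, 100]))
```

With a multiplier of 3 instead of the more common 1.5, only colonies far outside the bulk of the replicates are discarded.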
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C).
For the remaining pairs, because baits and preys were positioned in a way that, in a block of
four adjacent strains, all combinations of linker lengths could be tested for a specific
interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL), Is for the different linker size
combinations could be compared directly. The difference with the reference 2xL-2xL
interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL, and 4xL-4xL. A
paired t-test was used to identify significant differences in colony size (with FDR-corrected
p-values). These pairs of interacting proteins were separated in two additional categories:
unchanged interactions, in cases where the interaction was detected with the reference linker
size (2xL-2xL) and also with the longer linker combinations but without any significant
change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the
interaction was detected with the reference linker size (2xL-2xL) and presented significant
changes for at least one longer linker combination (difference greater than 1 or smaller than
-1 with t-test FDR p-value < 0.05) (Table S1C).
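The three categories can be expressed as a small decision function. The sketch below is illustrative only: the text does not name the FDR procedure, so Benjamini-Hochberg is assumed; pairs with a significant p-value but an absolute difference of 1 or less are treated as unchanged, which is also an assumption; and the paired t-test itself is not reimplemented (its p-values are inputs). All names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, differences, fdr_p_values,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref: detected with the 2xL-2xL reference combination.
    detected_longer: detected with at least one longer linker combination.
    differences: Is differences versus 2xL-2xL, one per longer combination.
    fdr_p_values: FDR-corrected paired t-test p-values, same order.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p in zip(differences, fdr_p_values):
        if abs(diff) > min_diff and p < alpha:
            return "quantitative change"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical values for one pair and three longer linker combinations.
    print(classify_pair(True, True, [1.5, 0.2, 0.1], [0.01, 0.5, 0.6]))
```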
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II, and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II, and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N, and 5FJA were selected as representative monomeric complexes for
the RNApol I, II, and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base, and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid, and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the core particle (CP),
using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B
structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well
resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base,
the lid, and the outer rings of the core. The inner rings of the core were from structure 5CZ4.
Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface
residues were used as nodes to build the graph. The edges of the graph were placed between
each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the
proteasome. The weight of the edges was equal to the distance between node pairs. Surface
residues were identified as follows. First, the structure of the protein complex was represented
using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of
10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in
the “wrl” graphic file format. From this file, the coordinates of each dot were extracted.
Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered as surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
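The surface-path distance described above can be illustrated with a minimal Python sketch. The analysis in the text used NetworkX; to stay dependency-free, this sketch swaps in a hand-written Dijkstra search based on heapq. The residue identifiers, coordinates, and function names are hypothetical, and the extraction of surface residues from PyMOL dots is not covered.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of surface Cα nodes closer than `cutoff` (in Å).

    coords: dict mapping a residue identifier (str) to an (x, y, z) tuple.
    Edge weights are the Euclidean distances between the two nodes.
    """
    ids = list(coords)
    graph = {i: [] for i in ids}
    for index, a in enumerate(ids):
        for b in ids[index + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Weighted shortest path length (Dijkstra); math.inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf


if __name__ == "__main__":
    # Hypothetical surface Cα coordinates (Å); A and C stand for two C-termini.
    coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (10, 10, 0)}
    graph = build_graph(coords, cutoff=15.0)
    print(shortest_path_length(graph, "A", "C"))
```

Because the path is forced to hop between surface nodes within the cutoff, the resulting distance follows the surface of the complex rather than cutting straight through it, which is the point of using a shortest path instead of a direct Euclidean distance.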
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL, and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits, and random proteins used as controls, for a total
of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110, and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL, and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many genetic
interactions and physical interactions (in vitro and in vivo) have been described between the
actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living
cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL
compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with
increased linker size reveals new interactions and could be an improved tool to study
inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II, and III, proteasome, and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II, and III and COG complex were also performed. Among the 10192
unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one
PPI) after filtration.
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However,
the increased linker length had an obvious impact for 93 (83 unique) interacting pairs
(Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs
and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker
length can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10, and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from the lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core, and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left
and right panels). Structural biology in living cells could thus gain from PPI data obtained
with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linker sizes
(Fig. S1D). Finally, we find that the distance among proteins is significantly longer for cases
where longer linker size increases signal or leads to the detection of new interactions
(Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect
interactions, especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101), and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment
complementation (PCA) screen and prove to be useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II, and III. Orange square:
proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs
(significantly decreased or increased when compared to the 2xL-2xL reference interaction).
Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with
a longer linker combination). (C) Proportions of quantitatively changed interactions and new
PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal
interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a
single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is
proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI.
Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs.
Stripe patterns inside colored boxes represent proteins that were absent from the experiment.
(E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes
within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly
detected. p-values of Wilcoxon tests are shown.
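For readers unfamiliar with the statistic reported in panel (B), the following minimal sketch computes a Spearman rank correlation between C-terminal distances and z-scores using only the Python standard library. The caption does not state which software was used for the original analysis, and the distances and z-scores below are invented for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminal distances and PPI z-scores (not thesis data)
distances = [20.0, 45.0, 60.0, 85.0, 110.0, 150.0]
z_scores = [6.2, 5.1, 5.5, 3.0, 2.7, 2.6]
rho = spearman(distances, z_scores)
```

A negative coefficient, as in the caption, means that the PCA signal tends to decrease as the distance between C-termini increases.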
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
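As a reading aid for the identity column, the sketch below shows one common way to compute a percent identity from two sequences that have already been aligned. The table does not specify the alignment tool or the denominator that was used, so the convention here (identical residues divided by the number of aligned, non-gap columns) and the toy sequences are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Assumes the two strings come from a pairwise alignment and therefore
    have the same length. The choice of denominator is a convention;
    other definitions (e.g. dividing by the full alignment length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not taken from the structures)
identity = percent_identity("MKT-LLV", "MKTALIV")
```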
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The
4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
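The thresholding described in panel (B) can be sketched as follows. The caption only states that positive PPIs have a z-score above 2.5; how the z-score was derived from colony sizes is not given here, so standardizing against the mean and standard deviation of a background distribution is an assumption, as are the function name and the colony sizes.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # threshold quoted in the caption


def call_positive_ppis(colony_sizes, background_sizes, threshold=Z_THRESHOLD):
    """Return {pair: (z_score, is_positive)} for each tested pair.

    Assumption: z = (colony size - background mean) / background SD.
    """
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    calls = {}
    for pair, size in colony_sizes.items():
        z = (size - mu) / sigma
        calls[pair] = (z, z > threshold)
    return calls


# Hypothetical colony sizes in arbitrary units (not thesis data)
background = [100, 110, 90, 105, 95]
observed = {("Pre1", "Pre2"): 400, ("Rpn5", "Scl1"): 108}
calls = call_positive_ppis(observed, background)
```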
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
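The idea of a distance-weighted shortest path between two C-termini, as in panels (C) and (D), can be sketched with Dijkstra's algorithm on a graph whose nodes are surface points and whose edges are weighted by Euclidean distance. This is a generic illustration, not the procedure used in the thesis: the neighbor cutoff, the function name and the toy coordinates are all assumptions.

```python
import heapq
import math


def shortest_surface_path(coords, start, goal, cutoff):
    """Length of the shortest path from start to goal.

    coords maps a node name to an (x, y, z) position. Two nodes are
    connected when they are at most `cutoff` apart, and each edge is
    weighted by the Euclidean distance between its endpoints.
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf


# Toy coordinates: the two termini are too far apart for a direct edge,
# so the path must go through the two intermediate surface points.
coords = {
    "Cter_A": (0, 0, 0),
    "surf_1": (0, 3, 0),
    "surf_2": (4, 3, 0),
    "Cter_B": (5, 0, 0),
}
path_length = shortest_surface_path(coords, "Cter_A", "Cter_B", cutoff=4.0)
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex instead of cutting through it, which is the reason such a path can be longer than the straight-line distance between the two termini.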
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
to each other in space without these associations necessarily being physical interactions.
Such a method would make it possible to examine and dissect the architecture of protein
complexes in greater depth. In practice, this meant modifying the length of the linkers used
in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify
whether increasing the linker length changed the interactions detected. It was also relevant
to test the application of the method to the study of protein complexes using several
combinations of linkers of different lengths. Finally, the validity of the method could be
further confirmed by comparing the results obtained with the distances measured from the
available protein structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thereby
improved the identification of associations. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the detected associations suggests that the specificity of
the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all unique interactions
detected were new. One could therefore expect to obtain a comparable proportion of new
interactions if this variation of the PCA were applied to protein complexes that have already
been studied. This percentage could vary with the number of combinations of linkers of
different lengths that are used. For example, it could be lower if only one combination were
used, since some protein-protein associations were detectable only with one specific
combination of linkers. Using an extended linker on the DHFR F[1,2] fragment alone appears to
be sufficient to detect most of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers is informative
about the organization of protein complexes, particularly when central proteins are involved.
Finally, the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. Doubling the length of the linker on the DHFR F[1,2] fragment
alone is enough to increase the capacity to detect distant protein-protein associations. In
future experiments, it would be appropriate to use the standard linker alongside the longer
linkers; this would provide a validation and a point of comparison and would help detect
problems that may have arisen during the construction of the proteins. For example, faulty
recombination or the appearance of mutations becomes easier to spot: the correctly
constructed protein would show interactions whereas the problematic one would show none.
This control does, however, make the experiments and analyses more complex. Despite this
drawback, this variation of the DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no special infrastructure, yet it can also be applied at large
scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the
endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close to each other, which reduces false positives. The
DHFR PCA can be performed on solid or in liquid medium, so PPIs are easy to study under
multiple growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the dynamics of interactions (56). Together,
these features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, so as to maximize the new information gained while accounting for the additional
complexity introduced by each added linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2]-tagged proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the detected associations, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution
would be appropriate for large protein complexes.
The method developed during this master's project is particularly attractive for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they have been
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution afforded by longer linkers would give a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as human, to be studied in a simpler model (51).
In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, which activates transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane-bound, insoluble and non-nuclear proteins. However, MYTH shares some drawbacks with Y2H, such as high background noise and the inability to quantify results (47-50, 52, 53).
PCA is similar to the two methods described above, but rather than using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is notably involved in the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while keeping the proteins under study under the control of their native promoter (48, 55, 56). In addition, PCA results suggest that the cellular localization of the proteins is preserved: there is a gene ontology enrichment for several proteins known to share the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker, however, often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to separate the two elements properly or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity. The two fluorescent proteins are fused to the two proteins whose proximity is being tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, high background noise and the partial overlap of the fluorescence of the two proteins can hinder interpretation of the results (60-63).
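The distance dependence that makes FRET usable as a proximity measure can be sketched with the standard Förster relation, E = 1 / (1 + (r / R0)^6). The thesis does not state this equation; it is added here as background, and the Förster radius used in the example (50 Å) is a hypothetical value chosen only for illustration.

```python
def fret_efficiency(distance: float, forster_radius: float) -> float:
    """Return the FRET efficiency E = 1 / (1 + (r / R0)**6).

    Both arguments must use the same length unit (for example angstroms).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius must be > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


if __name__ == "__main__":
    r0 = 50.0  # hypothetical Förster radius in Å, for illustration only
    for r in (25.0, 50.0, 100.0):
        print(f"r = {r:5.1f} Å -> E = {fret_efficiency(r, r0):.3f}")
```

Because of the sixth-power term, the efficiency drops from nearly 1 to nearly 0 over a narrow distance range around R0, which is why the signal reports proximity rather than a precise distance.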
Cross-linking followed by MS is practically identical to purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to a mistaken picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, this method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly attractive because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. High-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
Because of its relative simplicity and of the linker that joins the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and good association of the reporter fragments in the cellular environment. Glycine and serine are two small amino acids, the first nonpolar and the second polar and uncharged. The linker joins the reporter fragment to the C-terminus of the proteins under study.
Linker length also constrains the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is therefore possible to detect new interactions and, in doing so, to obtain new structural information.
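The relationship between linker length and the 82 Å threshold can be illustrated with a short calculation. This is only a sketch under stated assumptions: the per-residue length (about 4.1 Å) is back-calculated from the statement that 82 Å corresponds to two 10-residue linkers end to end, and linkers are treated as fully extended, which flexible linkers rarely are. The values for longer linkers are therefore rough upper bounds, not measurements from the thesis.

```python
MOTIF_LENGTH = 5  # residues in one GGGGS repeat

# Assumption: 82 Å for two linkers of two repeats each (20 residues in total)
# implies roughly 4.1 Å per residue for a fully extended linker.
ANGSTROM_PER_RESIDUE = 82.0 / (2 * 2 * MOTIF_LENGTH)


def linker_span(repeats: int) -> float:
    """Approximate end-to-end length (Å) of a linker with `repeats` GGGGS motifs."""
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    return repeats * MOTIF_LENGTH * ANGSTROM_PER_RESIDUE


def max_reach(repeats_a: int, repeats_b: int) -> float:
    """Approximate maximal C-terminus to C-terminus distance (Å) bridged by two linkers."""
    return linker_span(repeats_a) + linker_span(repeats_b)


if __name__ == "__main__":
    for a, b in ((2, 2), (3, 3), (4, 4)):
        print(f"{a}x / {b}x linkers: about {max_reach(a, b):.0f} Å")
```

Under these assumptions, each additional GGGGS repeat on one fusion extends the theoretical reach by about 20 Å.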
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow interactions that are increasingly distant in space to be detected, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed associations between proteins farther apart in space to be detected. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.
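The distance comparison in the final objective can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: the subunit names and coordinates are hypothetical, real coordinates would come from a solved structure, and the 82 Å threshold is the value quoted above for two standard linkers.

```python
import math
from itertools import combinations

# Hypothetical C-terminus coordinates (x, y, z) in Å, for illustration only.
c_termini = {
    "subunit_A": (0.0, 0.0, 0.0),
    "subunit_B": (30.0, 40.0, 0.0),
    "subunit_C": (0.0, 0.0, 120.0),
}


def pairwise_distances(coords):
    """Return {(name_1, name_2): Euclidean distance} for every unordered pair."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def pairs_within(coords, threshold):
    """Return the pairs whose C-termini are closer than `threshold` (same unit)."""
    return [pair for pair, d in pairwise_distances(coords).items() if d < threshold]


if __name__ == "__main__":
    for pair, d in pairwise_distances(c_termini).items():
        print(pair, f"{d:.1f} Å")
    print("Within 82 Å:", pairs_within(c_termini, 82.0))
```

Pairs flagged as within reach by such a calculation can then be compared with the pairs that give a PCA signal for a given linker combination.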
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of many powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, the assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because the first tells us which proteins assemble into complexes in the cell and the second how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments by a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins at which an interaction is detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect conformational changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
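The idea of encoding repeated Gly-Gly-Gly-Gly-Ser motifs with synonymous codons can be sketched as below. The codon choices are hypothetical: the text states only that synonymous codons were used, not which ones. The sketch rotates through the standard glycine and serine codons so that each repeat has a different DNA sequence while translating to the same peptide.

```python
MOTIF = "GGGGS"  # one-letter code for Gly-Gly-Gly-Gly-Ser

# Standard genetic code: all codons for glycine and serine.
GLY_CODONS = ("GGT", "GGC", "GGA", "GGG")
SER_CODONS = ("TCT", "TCC", "TCA", "TCG", "AGT", "AGC")
CODON_TABLE = {**{c: "G" for c in GLY_CODONS}, **{c: "S" for c in SER_CODONS}}


def encode_linker(repeats: int) -> str:
    """Return a DNA sequence for `repeats` GGGGS motifs, varying codons per repeat.

    The rotation scheme is an illustrative assumption, not the authors' design.
    """
    codons = []
    for r in range(repeats):
        for i, residue in enumerate(MOTIF):
            if residue == "G":
                codons.append(GLY_CODONS[(r + i) % len(GLY_CODONS)])
            else:
                codons.append(SER_CODONS[r % len(SER_CODONS)])
    return "".join(codons)


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence back to one-letter peptide code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    for n in (2, 3, 4):
        dna = encode_linker(n)
        print(f"{n}xL: {dna} -> {translate(dna)}")
```

Varying the codons between repeats reduces the amount of exactly repeated DNA, which is one common reason for using synonymous codons in repetitive constructs.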
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli, and clones were selected on 2YT+Amp. Finally, positive clones were confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.
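The amount of insert needed for a given insert:vector ratio can be worked out as in the sketch below. It assumes that the ratio is molar, which the text does not state explicitly, and the fragment lengths and vector mass in the example are hypothetical. For double-stranded DNA the average mass per base pair cancels out, so only the lengths and the ratio matter.

```python
def insert_mass_ng(vector_mass_ng: float, vector_length_bp: int,
                   insert_length_bp: int, molar_ratio: float = 5.0) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    Assumes double-stranded DNA for both fragments, so the mass of a fragment
    is proportional to its length in base pairs.
    """
    if min(vector_mass_ng, vector_length_bp, insert_length_bp, molar_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return molar_ratio * vector_mass_ng * insert_length_bp / vector_length_bp


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 5000 bp vector and a 500 bp insert.
    mass = insert_mass_ng(100.0, 5000, 500, molar_ratio=5.0)
    print(f"Insert needed for a 5:1 molar ratio: {mass:.1f} ng")
```

For an assembly with two inserts, the function can be called once per insert with the ratio chosen for that insert.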
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies targeting them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins, part of seven complexes, fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered the particle as being a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
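The transformation and filter described above can be sketched as follows. This is a minimal illustration, not the script used for the analyses: the function names and the `(diploid_size, mtx_intensity)` tuple layout are assumptions made for the example, and only the "+1 before log2" step and the size threshold of 16 come from the text.

```python
import math

# Size threshold on the diploid selection plate, as stated in the text.
MIN_DIPLOID_SIZE = 16


def log2_plus_one(intensity):
    """log2-transform a colony intensity after adding 1, so that 0 maps to 0."""
    return math.log2(intensity + 1)


def transform_colonies(colonies):
    """Drop colonies that failed diploid selection, then transform MTX intensity.

    `colonies` is a list of (diploid_size, mtx_intensity) tuples; this layout
    is hypothetical and chosen only for the example.
    """
    return [
        log2_plus_one(mtx_intensity)
        for diploid_size, mtx_intensity in colonies
        if diploid_size >= MIN_DIPLOID_SIZE
    ]


# The second colony is removed (diploid size 10 is smaller than 16).
print(transform_colonies([(120, 1023), (10, 500), (80, 0)]))
```

Adding 1 before the transformation keeps colonies with zero intensity in the dataset instead of producing an undefined logarithm.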
For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered as significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
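As an illustration of the scoring step, the sketch below computes the interaction score and z-score for one interaction. It assumes that the background mean (b) and standard deviation (sdb) have already been estimated from the mixture model; the mixture fitting itself is not reproduced here, and all function names are hypothetical.

```python
import statistics

DETECTION_THRESHOLD = 2.5   # Zs above this value counts as a detected PPI
CHANGE_THRESHOLD = 2.0      # Zs difference regarded as a significant change


def interaction_score(replicates, min_replicates=2):
    """Median colony size over replicates, or None if too few replicates."""
    if len(replicates) < min_replicates:
        return None
    return statistics.median(replicates)


def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (score - background_mean) / background_sd


def is_detected(zs):
    return zs > DETECTION_THRESHOLD


def differs(zs_a, zs_b):
    """True when two linker combinations differ by more than the threshold."""
    return abs(zs_a - zs_b) > CHANGE_THRESHOLD


# Hypothetical numbers: four replicates and assumed background parameters.
score = interaction_score([12.0, 13.0, 12.5, 12.5])
zs = z_score(score, background_mean=10.0, background_sd=0.5)
print(score, zs, is_detected(zs))
```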
For the intra-complexes experiment, extreme outliers on the MTX selection plates, lying farther from the median than Q1 − 3(Q3 − Q1) or Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
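The outlier rule and replicate threshold above can be sketched as follows. The quartile convention is an assumption (Python's default "exclusive" method in `statistics.quantiles`), since the text does not state which definition was used, and the function names are illustrative only.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    if len(values) < 2:
        return list(values)
    # Quartile definition is an assumption; other conventions shift the fences slightly.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(values, min_replicates=4):
    """Median of the filtered replicates, or None if fewer than four remain."""
    kept = remove_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)


# Hypothetical replicate colony sizes; 100 lies beyond Q3 + 3*(Q3 - Q1).
replicates = [10, 11, 12, 13, 14, 15, 16, 100]
print(remove_extreme_outliers(replicates))
print(interaction_score(replicates))
```

Using a factor of 3 on the interquartile range removes only extreme values, so ordinary replicate-to-replicate variation is left untouched.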
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
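The classification rule can be illustrated with the sketch below. Two points are assumptions rather than statements from the text: the FDR correction is taken to be the Benjamini-Hochberg procedure (the text only says "FDR corrected"), and a pair with a significant p-value but an absolute difference of 1 or less is treated as unchanged. The paired t-test itself is not reproduced; p-values are taken as given.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def classify_pair(results, alpha=0.05, min_abs_difference=1.0):
    """Classify a protein pair from its longer-linker comparisons.

    `results` holds one (difference_vs_2xL_2xL, fdr_p_value) tuple per linker
    combination (2xL-4xL, 4xL-2xL, 4xL-4xL).
    """
    changed = any(
        fdr_p < alpha and abs(difference) > min_abs_difference
        for difference, fdr_p in results
    )
    return "quantitative change" if changed else "unchanged"


# Hypothetical raw p-values and differences for one protein pair.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(fdr)
print(classify_pair([(0.4, 0.5), (1.6, 0.01), (0.2, 0.3)]))
```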
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
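The surface-path distance can be illustrated with a small self-contained sketch. The analyses relied on NetworkX; the version below re-implements the same idea with the Python standard library only, so it should be read as an illustration of the principle rather than the code that was run. The coordinates are toy values and the function names are hypothetical.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            distance = math.dist(coords[i], coords[j])
            if distance <= cutoff:
                graph[i].append((j, distance))
                graph[j].append((i, distance))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        distance, node = heapq.heappop(heap)
        if node == target:
            return distance
        if distance > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = distance + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Toy "surface" of three C-alpha positions forming a right angle (in Å).
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 0.0)]
graph = build_graph(coords, cutoff=12.0)
print(shortest_path_length(graph, 0, 2))  # path follows the surface nodes
```

Because nodes 0 and 2 are farther apart than the cutoff, the path is forced through node 1, which mimics how a path constrained to surface residues can be longer than the straight-line distance through the complex.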
All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
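The size of the screen follows directly from the numbers of baits and preys reported here and in the methods. The short check below reproduces that arithmetic; the variable names are chosen for the example only.

```python
# Bait side: 15 proteins, each fused with three linker lengths.
n_bait_proteins = 15
linker_lengths = ("2xL", "3xL", "4xL")
n_baits = n_bait_proteins * len(linker_lengths)

# Prey side: 495 selected strains plus 97 random control strains.
n_selected_preys = 495
n_control_preys = 97
n_preys = n_selected_preys + n_control_preys

# Every bait is crossed with every prey.
n_potential_interactions = n_baits * n_preys

print(n_baits, n_preys, n_potential_interactions)
```

The three values match the 45 baits, 592 preys and 26640 potential interactions stated in the text.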
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique; reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1: Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove to be useful to infer the super-organization of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs on total tested for each combination of subcomplexes within complexes.
Figure 2: Longer linkers allow for the detection of more distant proteins within complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
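The identity values in this table compare the sequence present in each structure with the sequence used experimentally. As a minimal sketch of how such a percentage can be computed, the function below takes two sequences that are already aligned (same length, gaps written as `-`) and counts identical residues over the positions where neither sequence has a gap. The function name, the gap character and the choice of denominator are assumptions made for illustration; this is not necessarily the procedure used to produce the table.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption: positions where either sequence has a gap are ignored,
    so the denominator is the number of aligned residue pairs.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example with hypothetical sequences: 4 identical residues out of 5 compared
identity = percent_identity("MKT-LV", "MKTALI")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when gaps are present.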
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
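The counts in this table describe how many residues at the C-terminal end of each protein are absent from the structure. A minimal sketch of that bookkeeping is given below, assuming that residues in the structure are numbered according to the full-length protein sequence (1-based) and that the list of resolved positions has already been extracted from the structure file. The function name and the inputs are hypothetical.

```python
def missing_c_terminal_residues(full_length, resolved_positions):
    """Number of residues missing after the last resolved residue.

    Assumptions: `full_length` is the length of the full protein sequence and
    `resolved_positions` holds 1-based residue numbers that follow the same
    numbering as that sequence.
    """
    positions = list(resolved_positions)
    if not positions:
        return full_length  # nothing resolved: the whole chain is missing
    return full_length - max(positions)


# Toy example: a 100-residue protein resolved from residue 5 to residue 90
n_missing = missing_c_terminal_residues(100, range(5, 91))
```

In this toy case the last ten residues (91 to 100) are not resolved, so the distance measured from the last resolved residue would underestimate the reach of the true C-terminus.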
Figure S1 Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect
interactions between proteins that are farther apart in space. (E) Repetition of the standard
DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility.
(F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complex
experiment. Examples for each category of changes are shown. Cell growth in spot-dilution
assay (right) correlates with colony size in standard PCA (left).
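Panel (B) separates positive PPIs from background with a z-score cutoff on colony size. The sketch below illustrates that kind of call under stated assumptions: the z-score is computed against the mean and sample standard deviation of a background set of colony sizes, and the cutoff is passed as `threshold` (set to 2.5 here as an assumed default). The function names, the background values and the way the reference distribution is chosen are hypothetical and are not taken from the thesis.

```python
from statistics import mean, stdev


def z_scores(values, background):
    """Z-score of each value relative to a background distribution."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return [(value - mu) / sigma for value in values]


def call_positives(colony_sizes, background, threshold=2.5):
    """Return True for each colony whose z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes: one at background level, one clearly above it
background = [10.0, 12.0, 11.0, 9.0, 13.0]
colony_sizes = [11.0, 30.0]
calls = call_positives(colony_sizes, background)
```

A colony at the background mean gets a z-score of 0 and is not called, while the large colony lies far above the cutoff and is called positive.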
Figure S2 Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
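A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm on a graph whose nodes are surface points and whose edge weights are Euclidean distances. The code below is only an illustration under stated assumptions: points closer than a `cutoff` are connected, coordinates are toy values, and the function names are hypothetical. It does not reproduce the exact procedure or parameters used for the figure.

```python
import heapq
import math


def build_graph(points, cutoff):
    """Connect every pair of points within `cutoff`; edge weight = Euclidean distance."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            if distance <= cutoff:
                graph[i].append((j, distance))
                graph[j].append((i, distance))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if the target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Toy coordinates (hypothetical): the path has to go around a corner
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 8.0, 0.0)]
graph = build_graph(points, cutoff=4.5)
path_length = shortest_path_length(graph, 0, 3)
```

Here the straight-line distance between the first and last points is about 8.5, but the path constrained to neighboring points has length 11, which is the sense in which a surface path is longer than a direct distance through the complex.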
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins in close spatial
proximity, without these associations necessarily being physical interactions. Such a method
would make it possible to explore and dissect the architecture of protein complexes in
greater depth. In practice, this meant modifying the length of the linkers used in the DHFR
PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether
increasing the linker length changed the interactions detected. It was also relevant to test
the application of the method to the study of protein complexes using several combinations
of linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results obtained with the distances measured from the available protein
structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, of all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have already
been studied. This percentage could vary with the number of combinations of linkers of
different lengths that are used. For example, it could be lower if only a single combination
were used, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be
sufficient to detect most of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nonetheless provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers is informative about the
organization of protein complexes, particularly when it involves central proteins. Finally,
the detected associations closely reflect the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with
the PCA results confirms that increasing the linker length does indeed allow the detection
of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply by doubling the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of other lengths, which would provide a validation and a point of
comparison and would help detect problems that occurred during the construction of the
proteins. For example, it becomes easier to spot a faulty recombination or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control does, however, make the experiments
and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides
an additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover,
the DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are very close
to each other, which reduces false positives. The DHFR PCA can be performed on either solid
or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives access to
the study of interaction dynamics (56). These features offer certain advantages over other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. It would
first be necessary to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize the new information gained, taking into account
the additional complexity introduced with each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different
linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed, the
longer the linker, the more distant the detected associations, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into consideration. For small protein
complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a
coarser resolution would be appropriate for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. The size
of these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution made possible by lengthening the linker would
give us a general picture of their architecture. At the level of an organism's proteome, this
method would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
given that the reporter fragments are added at the C-terminal end, which could interfere
with the localization signal sequence of the proteins (57).
A major drawback of most of these techniques stems from the addition of reporter
fragments, which can affect the folding, cellular function, or abundance of the protein.
However, adding a linker often reduces these risks by moving the reporter fragment away
from the protein to which it is attached, which reduces interference between the two
proteins. It may be necessary to optimize the composition or length of the linker. There
are three categories of linkers: flexible linkers, rigid linkers, and in vivo cleavable
linkers. Flexible linkers are generally used when some mobility between the protein of
interest and the reporter fragment is desirable. Rigid linkers provide better separation
between the protein of interest and the reporter fragment and ensure that the functions
of each element are maintained. They are mostly useful when a flexible linker is
insufficient to properly separate the two elements or when it interferes with the activity
of the protein. In vivo cleavable linkers allow the reporter fragment to be released under
certain conditions. They are particularly useful for allowing each element to carry out
its own biological activity. Consequently, it is essential to choose the linker and its
parameters carefully to obtain the expected results (58, 59).
1.3.2.2 Hybrid methods
Although classified in the second category of methods, FRET, cross-linking followed by MS,
and BioID are hybrid methods that measure protein-protein associations at lower
resolution.
FRET relies on energy transfer between two fluorescent proteins in close proximity to
each other. The two fluorescent proteins are fused to the two proteins whose proximity is
to be tested. Excitation of the donor fluorescent protein leads to excitation of the
acceptor fluorescent protein when the two proteins are close to each other. The
interaction is detected by microscopy or by cytometry through the emission of the
acceptor fluorescent protein. This method is particularly useful for following an
interaction over time. However, high background noise and the partial overlap of the
fluorescence of the two proteins can hinder the interpretation of the results (60-63).
Cross-linking followed by MS is practically identical to the purification and MS
techniques, except that the proteins are attached to one another by covalent bonds before
purification. These bonds withstand enzymatic digestion, thereby providing structural
information on how the proteins associate within the protein complex. Nevertheless,
cross-linking complicates data analysis and can potentially lead to a misconception of
the architecture of the protein complex. This method is difficult to apply to the global
study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby
proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that
is fused to the protein of interest. Interactors carrying a biotin group on their
accessible lysines are selectively isolated and identified by MS. BioID can detect weak
and transient interactions in addition to interactions between neighbouring proteins.
However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent
protein widely used in molecular biology. This large size can impair the activity of the
protein of interest or the formation of interactions. In addition, this method is not
quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly valuable because they give a more
global view of the PPI network. They report on the proximity of proteins, giving access
to a molecular scale of resolution that is otherwise difficult to reach. In addition to
being complex, the existing techniques require specialized infrastructure (equipment and
databases) and are difficult to apply on a large scale. Developing simpler,
higher-throughput hybrid methods would make it possible to better define the architecture
of protein complexes and their subcomplexes at low molecular resolution. Such methods
would complement the two categories of methods. These new hybrid methods would compensate
for the shortcomings of high-molecular-resolution methods such as crystallography or
nuclear magnetic resonance, which determine the precise structure of proteins or protein
complexes. Indeed, high-resolution methods are difficult to apply to many protein
complexes and require an approach specific to each complex.
1.5 The linker: a potentially useful parameter for modulating the
detection of protein-protein interactions
Because of its relative simplicity and of the linker that connects the reporter fragments
to the proteins of interest, PCA is a method of choice for developing a hybrid method.
The linker is a short, soluble, and flexible peptide segment composed of two repeats of
the following motif: four glycines and one serine (GGGGS). It ensures good flexibility
and good association of the reporter fragments in the cellular environment. Indeed,
glycine and serine are two small amino acids, the former nonpolar and the latter polar
and uncharged. The linker connects the reporter fragment to the C-terminus of the
proteins under study.
Linker length also imposes a constraint on the ability to detect an interaction, as was
notably observed by the research team that developed large-scale PCA (55). While studying
RNA polymerase (RNApol) II and several other protein complexes, the authors noticed that
an interaction was 35 times more likely to be detected when the C-termini of the proteins
of interest were less than 82 Å apart (55). This distance corresponds to the length of
the two linkers placed end to end. Moreover, an earlier study had shown that increasing
linker length made it possible to determine the conformation of a dimeric receptor (69).
It is therefore possible to detect new interactions and, in doing so, to obtain new
structural information.
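As a rough illustration of how linker length could translate into detection range, the sketch below scales the 82 Å figure reported for two 2xL linkers end to end (55) to longer linkers. The per-residue length is simply back-calculated from that figure, and both the linear scaling and the function name `max_reach` are assumptions made for illustration. A flexible linker is rarely fully extended in the cell, so these values are best read as upper bounds.

```python
# Rough reach estimate for (GGGGS)n linkers. Linear scaling is an assumption.
RESIDUES_PER_REPEAT = 5      # one GGGGS motif
REPORTED_REACH = 82.0        # Å, reported for two 2xL linkers end to end (55)

# Per-residue length implied by the reported figure: 82 Å over 2 x 10 residues.
ANGSTROM_PER_RESIDUE = REPORTED_REACH / (2 * 2 * RESIDUES_PER_REPEAT)


def max_reach(repeats_a: int, repeats_b: int) -> float:
    """Upper bound (Å) on the C-terminus separation spanned by two linkers."""
    residues = (repeats_a + repeats_b) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


for a, b in [(2, 2), (2, 4), (4, 4)]:
    print(f"{a}xL + {b}xL: about {max_reach(a, b):.0f} Å")
```

Under these assumptions, doubling both linkers doubles the estimated reach, which is the intuition behind using linker length to modulate the resolution of the assay.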
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs.
The hypothesis of my work was that increasing the linker length of the DHFR PCA would
allow the detection of interactions that are increasingly distant in space, thereby
modulating the scale of molecular resolution. This adaptation would yield a new hybrid
method that could help define protein-protein associations between protein complexes and
subcomplexes. The first objective was to assess the general impact of different linker
lengths on the ability to detect protein-protein associations. To this end,
protein-protein associations between 15 proteins found in seven protein complexes were
tested against the proteins found in these complexes and their known interactors. The
second objective was to assess the impact of increasing linker length on our understanding
of the architecture of protein complexes and their subcomplexes. Five protein complexes
differing in size and flexibility were studied: the proteasome, RNApol I, II, and III, and
the conserved oligomeric Golgi (COG) complex. The study was carried out with different
combinations of linker lengths. The final objective was to determine whether increasing
linker length made it possible to detect associations between proteins that are farther
apart in space. To do so, distances were calculated between the proteins contained in the
proteasome structures and compared with the experimental results.
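The comparison described for the last objective can be sketched as follows. The subunit names and coordinates are hypothetical placeholders, not values taken from a proteasome structure, and the 164 Å threshold is a hypothetical reach for two 4xL linkers that assumes reach scales linearly with linker length.

```python
import math
from itertools import combinations

# Hypothetical C-terminal coordinates (Å). Real values would be read from a structure file.
C_TERMINI = {
    "SubunitA": (10.0, 0.0, 0.0),
    "SubunitB": (70.0, 0.0, 0.0),
    "SubunitC": (10.0, 120.0, 0.0),
}


def pairs_within(reach: float) -> list:
    """Return subunit pairs whose C-termini lie within `reach` Å of each other."""
    return [
        (a, b)
        for a, b in combinations(sorted(C_TERMINI), 2)
        if math.dist(C_TERMINI[a], C_TERMINI[b]) <= reach
    ]


# 82 Å is the distance reported for two 2xL linkers (55); 164 Å is an assumed value.
for reach in (82.0, 164.0):
    print(f"within {reach:.0f} Å: {pairs_within(reach)}")
```

Pairs that fall within the longer threshold but not the shorter one are the ones a longer linker would be expected to newly detect, which can then be checked against the experimental PCA signal.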
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of numerous
powerful genetic tools, its rapid cell division, and the abundance of data on protein
complex structures and PPIs. Moreover, this organism has played a key role in advancing
knowledge in various fields such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks, and the study of human
diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system functions requires cataloguing how proteins
assemble with one another into complexes and determining their spatial arrangements. We
examined the potential of the protein-fragment complementation assay based on
dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural
constraints on protein complexes. We showed that using extended peptide linkers between
the fusion proteins and the DHFR fragments improves the detection of protein-protein
interactions and reveals interactions that are more distant in space. Extended linkers
thus provide an improved tool for detecting and measuring protein-protein interactions
and protein proximity in vivo. We used this tool to further investigate the architecture
of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex
in yeast. Our results offer new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we
examine the potential of the yeast Protein-fragment Complementation Assay based on
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on
protein complexes. We show that the use of longer peptide linkers between the fusion
proteins and the DHFR fragments significantly improves the detection of protein-protein
interactions and reveals interactions between proteins that are farther apart in space.
Longer linkers thus provide an enhanced tool for the detection and measurement of
protein-protein interactions and protein proximity in living cells. We use this tool to
further investigate the architecture of the RNA polymerases, the proteasome, and the
conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the
dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the
organization of PPI networks have revealed important insights into the evolution of
cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to
mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at
the transcriptional, translational, and posttranslational levels contributes to the
diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main
categories based on whether they infer co-complex memberships or detect physical
association (81). The first category includes methods based on protein purification
followed by mass spectrometry. In this case, protein assignment to a specific complex
depends on stable associations among proteins that survive cell lysis and fractionation
or affinity purification (82, 83). The majority of PPIs that populate interactome
databases derive from such methods because a single purification leads to the inference
of many interactions among the co-purified proteins. Unfortunately, very little is known
about the structural and context dependencies of PPIs inferred from co-complex
membership because detecting an association does not provide information on the spatial
organization of the complex (84-86). The second category of methods reports binary or
pairwise interactions between proteins and reveals direct or nearly direct interactions.
Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment
complementation assays (PCAs) (87), and technologies based on similar principles (52).
The two categories are potentially complementary because, on the one hand, they tell us
which proteins assemble into complexes in the cell and, on the other hand, how proteins
may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in
living cells is key to future developments in cell and systems biology because
high-resolution methods such as NMR or X-ray crystallography are not yet amenable to
high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may
provide the technological advantages required for such an approach by complementing
methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter
protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a
functional protein that acts as a reporter for the association of the two proteins
(55, 89-94). Proteins are usually connected to the reporter fragments with a linker of
ten amino acids. In principle, the length of the linker limits the maximum distance
between the proteins for an interaction to be detectable. The first large-scale study
performed using DHFR PCA in yeast showed that the distance constraint imposed by linker
length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II
complex and several other protein complexes for which the distance between the C-termini
of proteins could be measured, protein interactions were 35 times more likely to be
detected if the C-termini were within 82 Å of each other. In addition, an earlier study
in mammalian cells showed that increasing the linker length of the PCA reporter makes it
possible to detect configuration changes in a dimeric membrane receptor (69). Together,
these results suggest that linkers of variable sizes could improve the detection of PPIs
and even be used as a ruler to infer, albeit roughly, distances between proteins in
living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA
in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the
Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆
leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid
medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B
(HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were
grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium
sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine, and lysine,
and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2%
agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates
to create new plasmids containing DHFR fragments fused to linkers of varying size. Both
original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two
for the 4xL) were introduced between the existing linker and the DHFR fragments,
resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions
were composed of synonymous codons leading to the same peptide sequence.
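One way to picture repeats built from synonymous codons is sketched below. The codon choices and the rotation scheme are illustrative assumptions, not the sequences actually synthesized for these plasmids; the point is only that every repeat differs at the DNA level while still translating to Gly-Gly-Gly-Gly-Ser.

```python
# Standard genetic code entries for glycine and serine.
GLY_CODONS = ["GGT", "GGC", "GGA", "GGG"]
SER_CODONS = ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]
CODON_TO_AA = {**{c: "G" for c in GLY_CODONS}, **{c: "S" for c in SER_CODONS}}


def encode_repeat(index: int) -> str:
    """DNA for one GGGGS repeat, rotating codon choice with the repeat index."""
    gly = [GLY_CODONS[(index + k) % len(GLY_CODONS)] for k in range(4)]
    ser = SER_CODONS[index % len(SER_CODONS)]
    return "".join(gly) + ser


def encode_linker(n_repeats: int) -> str:
    """DNA for an n-repeat linker (hypothetical codon usage)."""
    return "".join(encode_repeat(i) for i in range(n_repeats))


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence, codon by codon."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


dna_4x = encode_linker(4)
print(dna_4x)
print(translate(dna_4x))
```

Each repeat is 15 nucleotides long, so a 4xL linker built this way spans 60 nucleotides and 20 residues.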
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR
F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into plasmid pUC57
with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then
amplified by PCR, digested with DpnI, and purified. The plasmid pAG25-linker-DHFR
F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the
plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids
were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning
reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally,
positive clones were confirmed by double digestion with XbaI and BamHI and by Sanger
sequencing.
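For readers setting up a similar assembly, the arithmetic behind an insert:vector ratio can be sketched as below, assuming the ratio is a molar ratio (the text does not say so explicitly). The fragment lengths and DNA amounts are hypothetical and the function name `insert_mass_ng` is introduced only for this example.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    For double-stranded DNA, moles scale with mass divided by length, so the
    per-base-pair mass cancels out of the ratio.
    """
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 100 ng of a 5000 bp backbone and a 500 bp insert at 5:1.
print(insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=500, molar_ratio=5))
```

The same function can be called once per insert for a three-fragment assembly, for example with a ratio of 4 for each of two inserts relative to the vector.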
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the
pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified
from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The
DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions
were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested
with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3]
region was gel-extracted. The remaining steps were performed as described above for
pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio
of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions,
respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively),
were amplified by PCR from their respective plasmids with oligonucleotides specific to
the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table
S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules
following standard procedures, and selection was performed on YPD+clonNAT (DHFR
F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing
confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused
to the 2xL and 4xL. These proteins were selected because we could readily assess their
abundance using antibodies raised against them. Exponentially growing cells (20 OD600)
were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin, and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment
(Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and
supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of
cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p)
SDS-PAGE gel and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry
device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000), or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2
hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes
were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000), or IRDye® 800CW Goat anti-Mouse IgG (1:10000)
in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween
20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging
System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belonged to the same complexes as the baits or
because they interacted with one of them based on data reported in BioGRID in October
2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or
the nucleus was also included in the set of preys as controls. Each prey was present in
four replicates, two on each prey plate, so each interaction was measured four times.
Preys were randomly positioned to avoid location biases.
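The randomized prey layout can be pictured with the sketch below. It assumes that the 97 control strains come in addition to the 495 selected preys and that a two-colony border is left free on each plate, as is described for the intra-complex arrays; the prey names, the seed, and the function name `layout_prey_plate` are hypothetical.

```python
import random
from collections import Counter

ROWS, COLS = 32, 48   # dimensions of a 1536-format array


def layout_prey_plate(preys, replicates=2, border=2, seed=0):
    """Assign each prey to `replicates` random interior positions of one plate."""
    interior = [(r, c)
                for r in range(border, ROWS - border)
                for c in range(border, COLS - border)]
    colonies = [p for p in preys for _ in range(replicates)]
    if len(colonies) > len(interior):
        raise ValueError("not enough interior positions for all colonies")
    rng = random.Random(seed)            # fixed seed keeps the layout reproducible
    positions = rng.sample(interior, len(colonies))
    return dict(zip(positions, colonies))


# Hypothetical prey names; 495 + 97 strains is an assumed total.
preys = [f"prey{i:03d}" for i in range(495 + 97)]
plate = layout_prey_plate(preys)
counts = Counter(plate.values())
print(len(plate), "colonies placed,", len(counts), "distinct preys")
```

Generating a second plate with a different seed would give the second pair of replicates at independent positions.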
For the intra-complex experiment, we reviewed the literature and considered the consensus
protein complexes published by Benschop et al. (84) to choose 95 central and associated
proteins that are members of the following complexes: RNApol I, II, and III, the
proteasome, and the COG complex. These complexes were selected because they vary in size
(RNApol I (n=14), II (n=12), III (n=17), and associated proteins (n=9, 7 tested);
proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among
members of these complexes have been shown to be detectable, at least partially, by DHFR
PCA. In addition, published structures are available for the RNApol and proteasome
complexes, making it possible to compare our results with known protein complex
organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and
65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG
complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or
2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins
initially selected are tagged with 2xL and 4xL in at least one mating type). MATα
2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were
generated, including all strains mentioned above.
18
Baits and preys were positioned in a way that in a block of four strains all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL 2xL-4xL 4xL-2xL and 4xL-
4xL) Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex The blocks were randomly
positioned on the colony arrays Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[12]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform
crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the
two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C.
Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time
of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using
the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were conserved and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
as significantly detected interactions. These Zs were used to compare the same interaction
with different linker size combinations. We considered changes significant when Zs differed
by more than 2.
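The scoring step above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the source fitted the mixture with normalmixEM from the R package mixtools, whereas this sketch swaps in a small hand-written EM in pure Python. The function names, the initialization (median and maximum as starting means) and the simulated data are assumptions made for the example.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution N(mu, sd) at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1D normal mixture by EM.

    Returns [weight, mean, sd] for each component, sorted by mean.
    Starting means (median and maximum) are an illustrative choice.
    """
    xs = list(scores)
    sd0 = statistics.pstdev(xs) or 1.0
    comps = [[0.5, statistics.median(xs), sd0], [0.5, max(xs), sd0]]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, mu, sd) for w, mu, sd in comps]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weight, mean and sd of each component
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            comps[k] = [nk / len(xs), mu, max(math.sqrt(var), 1e-6)]
    return sorted(comps, key=lambda c: c[1])


def background_parameters(scores):
    """Mean (b) and sd (sdb) of the lower-mean (background) component."""
    _, b, sdb = fit_two_normals(scores)[0]
    return b, sdb


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb."""
    return (interaction_score - b) / sdb


def simulated_scores(seed=0):
    """Synthetic scores (hypothetical): 300 background and 40 signal values."""
    rng = random.Random(seed)
    background = [rng.gauss(2.0, 0.5) for _ in range(300)]
    signal = [rng.gauss(8.0, 1.0) for _ in range(40)]
    return background + signal
```

With b and sdb estimated from the background component, an interaction would be called detected when z_score(Is, b, sdb) exceeds 2.5, the threshold stated in the text.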
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e.
values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent
the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
conserved and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one
linker combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique)
pairs of interacting proteins.
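The outlier filter and replicate requirement described above can be illustrated with a short sketch. It is not the original analysis code; the function names are hypothetical, and the quartile convention (the "inclusive" method of Python's statistics.quantiles) is an assumption, since the text does not state how Q1 and Q3 were computed.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(values, min_replicates=4):
    """Median colony size after outlier removal, or None if too few replicates remain."""
    kept = remove_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)
```

For example, with replicate sizes [10, 11, 12, 13, 14, 15, 16, 100], the value 100 lies above Q3 + 3(Q3 - Q1) and is dropped before the median is taken.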
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer
linker combination) were separated from the others and classified as new interactions
(Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a
block of four adjacent strains, all combinations of linker lengths could be tested for a
specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker
size combinations could be compared directly. The difference with the reference 2xL-2xL
interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A
paired t-test was used to identify significant differences in colony size (with
FDR-corrected p-values). These pairs of interacting proteins were separated into two
additional categories: unchanged interactions, in cases where the interaction was detected
with the reference linker size (2xL-2xL) and also with the longer linker combinations but
without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in
cases where the interaction was detected with the reference linker size (2xL-2xL) and
presented significant changes for at least one longer linker combination (difference greater
than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
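The three categories defined above (new, unchanged, quantitative change) can be expressed as a small decision rule. The sketch below is illustrative only. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is an assumption, and the function names and input layout are hypothetical. The paired t-test itself is not reimplemented; the sketch starts from its raw p-values.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, longer, min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref: True if detected with the 2xL-2xL reference.
    longer: dict mapping a longer linker combination to a tuple
            (detected, difference_vs_reference, fdr_p_value).
    """
    any_longer = any(det for det, _, _ in longer.values())
    if not detected_ref:
        return "new" if any_longer else "not detected"
    changed = any(
        abs(diff) > min_diff and p < alpha for _, diff, p in longer.values()
    )
    return "quantitative change" if changed else "unchanged"
```

A pair detected with 2xL-2xL whose 4xL-4xL score differs by 1.8 with an adjusted p-value of 0.01 would be labelled a quantitative change, while the same difference with an adjusted p-value of 0.2 would leave it unchanged.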
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex
in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in the PyMOL software. Visual inspection of the resulting superposed 5A5B
structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well
resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base,
the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4.
Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate
the distance (a list is provided in Table S2D). The distances were calculated from the
weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example
of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of
surface residues were used as nodes to build the graph. The edges of the graph were placed
between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for
the proteasome. The weight of the edges was equal to the distance between node pairs.
Surface residues were identified as follows. First, the structure of the protein complex was
represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent
radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were
exported in the “wrl” graphic file format. From this file, the coordinates of each dot were
extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any
dot of the proteasome structure were considered as surface residues (see Fig. S2D for a
representation of the method for the proteasome). In cases where multiple copies of the
proteins were present within the complexes, the mean of the minimal distances possible was
used for the analyses.
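The distance calculation described above can be sketched with a small self-contained example. The study used the Dijkstra implementation in NetworkX; to keep the example dependency-free, this sketch swaps in a plain-Python Dijkstra built on heapq. The node labels, coordinates and function names are hypothetical, and surface-residue selection is assumed to have been done beforehand.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (in Å).

    coords: dict mapping a node label to its (x, y, z) Cα coordinates.
    Edge weights are the Euclidean distances between node pairs.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Length of the weighted shortest path (Dijkstra); inf if unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node]:
            nd = d + weight
            if nd < dist.get(neighbour, math.inf):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return math.inf
```

Because edges only join surface nodes within the cutoff, the resulting path follows the surface of the complex rather than cutting through it, which is why the path length can exceed the straight-line distance between two C-termini.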
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
that are within the same complexes as the baits and random proteins used as controls, for a
total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110
and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B,
top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many
genetic interactions and physical interactions (in vitro and in vivo) have been described
between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some
interactions in living cells (such as between Arc18 and Pup1), often with an increased signal
with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR
PCA with increased linker size reveals new interactions and could be an improved tool to
study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the
10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and
Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with
reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3]
accounting for only one PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
fact that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two
largest components of the RNApol II, contributes to five out of the nine quantitatively
decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and
Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric
effects rather than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions
involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit
(Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from
the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results
show that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left
and right panels). Structural biology in living cells could thus gain from PPI data obtained
with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A).
We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For
the proteasome, the complex for which we have the most distance values, a negative
correlation is observed between the pairwise distance and the interaction z-score of PPIs for
all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely
due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer
distances with longer linker sizes is clearly visible from the cumulative distribution of
z-scores as a function of pairwise distances, where positive z-scores accumulate up to a
longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right
panel). The density distribution of distances within complexes is also slightly shifted
towards larger distances for longer linkers, showing that longer distances are better
detected with longer linker sizes (Fig. S1D). Finally, we find that the distance among
proteins is significantly longer for cases where a longer linker size increases signal or
leads to the detection of new interactions (Fig. 2C). This demonstrates once again that
longer linker size enhances the ability to detect interactions, especially for proteins that
are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here,
we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used
to detect interactions in these specific conditions with an increased signal-to-noise ratio
and with an enhanced ability to detect distant PPIs, including interactions among complexes
and subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection could
be used as preys, while requiring only the construction of baits with different linker sizes.
PCA is therefore an addition to the other methods available to obtain low-resolution
structural information among subunits of complexes, which include chemical cross-linking of
protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent
biotinylation in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove to be useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square:
proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs
(significantly decreased or increased when compared to the 2xL-2xL reference interaction).
Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with
a longer linker combination). (C) Proportions of quantitatively changed interactions and new
PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal
interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single
PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is
proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI.
Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs.
Stripe patterns inside colored boxes represent proteins that were absent from the
experiment. (E) Proportion of detected PPIs out of the total tested for each combination of
subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores
for these newly detected interactions are indicated in the tables. DHFR fragments have also
been modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative
z-scores for the proteasome PPIs according to the different protein pairwise distances. (C)
Distribution of three categories of detected PPIs for the RNApol and proteasome complexes
according to the distance between the C-termini, for interactions that are not affected by
longer linkers and those that increase in signal or that are newly detected. p-values of
Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 829
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 895
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 859
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 931
Constructed proteasome chain 58 Rpn2 Proteasome 909
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 829
Constructed proteasome chain 66 Rpn11 Proteasome 895
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 859
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein | Complex | Reference structure | Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a
z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The
4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL
linker to detect interactions that are farther apart in space. (E) Repetition of the standard
DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility.
(F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the
intra-complex experiments. Examples for each category of change are shown. Cell growth in the
spot-dilution assay (right) correlates with colony size in the standard PCA (left).
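As an illustration of the positive-PPI call described in panel (B), the following minimal sketch standardizes colony sizes and flags those above a z-score cutoff. It is not the thesis pipeline: the function names are hypothetical, the z-score is assumed to be computed against the mean and sample standard deviation of all colonies in the experiment, and the default cutoff of 2.5 is our reading of the threshold given in the caption.

```python
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Standardize colony sizes against their mean and sample standard deviation.

    Assumption: the reference distribution is the full set of colonies passed in.
    """
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, threshold=2.5):
    """Return one boolean per colony: True when its z-score exceeds the cutoff."""
    return [z > threshold for z in z_scores(colony_sizes)]


# Toy data: 19 background colonies and one clearly larger colony.
sizes = [10] * 19 + [100]
calls = call_positive(sizes)
```

In a real screen the background distribution would more likely be estimated per plate or per bait, so the reference set passed to `z_scores` is the main design choice to revisit.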
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
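The distance-weighted shortest path of panels (C) and (D) can be sketched as a Dijkstra search over surface points. This is only an illustration under stated assumptions: the function name, the rule that two points are connected when they lie within `cutoff` of each other, and the cutoff values are all hypothetical and are not taken from the thesis. Edge weights are Euclidean distances, so the returned length approximates a path that follows the surface instead of cutting through the complex.

```python
import heapq
import math


def shortest_surface_path(coords, start, goal, cutoff=8.0):
    """Dijkstra over 3D points; points closer than `cutoff` are linked by distance-weighted edges.

    coords: list of (x, y, z) tuples; start and goal: indices into coords.
    Returns (path_length, path_indices), or (math.inf, []) if goal is unreachable.
    """
    n = len(coords)
    dist = [math.inf] * n
    prev = [None] * n
    dist[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u == goal:
            break  # the goal's distance is final once it is popped
        for v in range(n):
            if v == u:
                continue
            w = math.dist(coords[u], coords[v])
            if w <= cutoff and d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    if math.isinf(dist[goal]):
        return math.inf, []
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return dist[goal], path[::-1]


# Toy example: point 0 and point 2 are 5 units apart, but with a cutoff of 4.5
# the path must go through point 1 (3 + 4 = 7 units). Point 3 is isolated.
points = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (100, 0, 0)]
length, path = shortest_surface_path(points, 0, 2, cutoff=4.5)
```

The all-pairs neighbor scan is quadratic in the number of points; for a structure the size of the proteasome a spatial index would be a natural replacement.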
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid
method designates a method that detects associations between proteins that are close together
in space without necessarily being physical interactions. Such a method would make it possible
to explore and better dissect the architecture of protein complexes. Concretely, the approach
consisted of modifying the length of the linkers of the DHFR PCA in S. cerevisiae. To validate
the method, it was first necessary to verify whether increasing the linker length changed the
interactions detected. It was also relevant to test the application of the method to the study
of protein complexes using several combinations of linkers of different lengths. Finally, the
validity of the method could be further confirmed by comparing the results obtained with the
distances measured from the available protein structures of the proteasome.
The results of the first validation show that by adjusting a single parameter, namely doubling
the length of one linker, the signal-to-noise ratio increased significantly, allowing better
identification of associations. Seven new associations were observed within protein complexes
and between different complexes, notably between the proteasome and the actin cytoskeleton.
The nature of the associations detected suggests that the specificity of the DHFR PCA is
preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of the DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all the unique interactions detected, more than
30% were new. One could therefore expect to obtain practically as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
used. For example, it could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers is informative about the
organization of protein complexes, particularly when it involves the central proteins.
Finally, the associations detected closely reflect the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length does
indeed allow the detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of additional lengths, which would provide a validation and a point of
comparison and would help detect problems that may have occurred during the construction of
the proteins. For example, it is easier to spot faulty recombination or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Admittedly, adding this control makes the experiments and
analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an
additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover,
the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression.
The fragments do not tend to interact spontaneously with each other unless they are very close
together, which reduces false positives. The DHFR PCA can be performed either on solid medium
or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in
the presence of cellular perturbations. It can also be followed in real time, which gives
access to the study of interaction dynamics (56). These features provide certain advantages
over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths providing several resolutions of the PPI network. It would first be
necessary to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. It would also be necessary to determine the
optimal increment to maximize new information, taking into account the additional complexity
that comes with each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such
additional collections would give a picture of the yeast protein-protein association network
at different resolutions, from fine to coarse. Indeed, the more the linker length is
increased, the more distant the detected associations are, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into consideration. For small protein
complexes, a finer resolution, and therefore shorter linkers, could prove sufficient, whereas
the resolution should be lower for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
perfectly known but which are visible by electron microscopy or with other imaging methods.
The size of these complexes greatly limits their study and makes determining their
architecture a challenge. Processing bodies and stress granules are one example. They are
involved, respectively, in the degradation and the storage of messenger RNA during cellular
stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the
linker would give us a general picture of their architecture. For the proteome of an organism,
this method would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010;(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015;(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
an interaction over time. However, the level of background noise and the partial overlap of the fluorescence of the two proteins can hinder interpretation of the results (60-63).
Cross-linking followed by MS is nearly identical to the purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect model of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).
BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID detects weak and transient interactions as well as interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).
1.4 Current challenge in the study of protein-protein interactions
The hybrid methods described above are particularly interesting because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the first nonpolar and the second polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.
Linker length also constrains the ability to detect an interaction, as notably observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Moreover, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
1.6 Research objectives
The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would make it possible to detect interactions that are increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increased linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes that differ in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.
This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Résumé
Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable association among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). The two categories of methods are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to linkers of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR products were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin, and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After blocking in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000), or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000), or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II, and III, the proteasome, and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17), and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
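The block layout described above, in which one bait-prey pair is tested with every combination of linker sizes, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the arraying procedure used in the study: the function name `linker_block` and the strain-naming scheme are hypothetical, and only the four linker combinations come from the text.

```python
from itertools import product

# Linker sizes tested in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return the four bait-prey linker combinations that form one block.

    The function name and the "<protein>-<linker>" naming are illustrative
    assumptions, not the authors' implementation.
    """
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(LINKERS, LINKERS)]


# Example with two proteasome subunits named in the text.
block = linker_block("Rpn5", "Scl1")
for bait_strain, prey_strain in block:
    print(bait_strain, "x", prey_strain)
```

Keeping the four combinations adjacent on the array is what later allows the interaction scores of a given pair to be compared directly across linker sizes.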
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Next, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.
For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the signal had increased, not changed, or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
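The transformation and the diploid-size filter described above can be illustrated with a short Python sketch. It is only a minimal example: the function name `colony_score`, the use of `None` for discarded colonies, and the example values are assumptions, and the cutoff is passed as a parameter whose default is the value quoted in the text.

```python
import math


def colony_score(intensity, diploid_size, min_diploid_size=16):
    """Return log2(intensity + 1), or None if the diploid colony was too small.

    The default cutoff is the one quoted in the text; the function name and
    the None convention are illustrative choices.
    """
    if diploid_size < min_diploid_size:
        return None  # colony did not grow enough on diploid selection
    return math.log2(intensity + 1)  # adding 1 keeps zero intensities defined


print(colony_score(1023, diploid_size=40))  # intensity of 1023 gives log2(1024)
print(colony_score(0, diploid_size=40))     # a null intensity maps to 0
print(colony_score(500, diploid_size=10))   # filtered out
```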
For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when the Zs differed by more than 2.
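As a worked illustration of the scoring above, the sketch below converts interaction scores into z-scores and applies the two cutoffs quoted in the text. It is a minimal example under stated assumptions: the background mean and standard deviation are hypothetical numbers standing in for the estimates of the mixture model (which in the text are obtained per linker combination), the function names are illustrative, and reading "differed by more than 2" as an absolute difference is an interpretation.

```python
def z_score(interaction_score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb describing the background component."""
    return (interaction_score - background_mean) / background_sd


def compare_linkers(zs_reference, zs_longer, detection_cutoff=2.5, change_cutoff=2.0):
    """Summarize one interaction measured with two linker combinations."""
    return {
        "detected_reference": zs_reference > detection_cutoff,
        "detected_longer": zs_longer > detection_cutoff,
        # Assumption: a change counts in either direction (absolute difference).
        "significant_change": abs(zs_longer - zs_reference) > change_cutoff,
    }


# Hypothetical background parameters; a single pair is reused here for brevity.
b, sdb = 8.0, 0.5
zs_2x = z_score(9.0, b, sdb)   # reference 2xL-2xL score
zs_4x = z_score(10.5, b, sdb)  # same interaction with a longer linker
summary = compare_linkers(zs_2x, zs_4x)
print(zs_2x, zs_4x, summary)
```

In this made-up example the interaction falls below the detection cutoff with the reference linker and above it with the longer linker, which is the pattern the study is looking for.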
For the intra-complexes experiment, extreme outliers on the MTX selection plates, that is, values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
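The outlier rule used at the start of this filtering (values outside Q1 − 3(Q3 − Q1) and Q3 + 3(Q3 − Q1)) can be sketched as follows. This is a minimal illustration rather than the analysis script of the study: the function name, the replicate values, and the quartile definition (Python's "inclusive" method) are assumptions, and other quartile conventions would shift the bounds slightly.

```python
from statistics import median, quantiles


def drop_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR], with IQR = Q3 - Q1."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes for one bait-prey pair.
replicates = [10.1, 10.4, 9.8, 10.0, 10.2, 10.3, 9.9, 25.0]
kept = drop_extreme_outliers(replicates)
interaction_score = median(kept)  # median of the remaining replicates is the Is
print(kept, interaction_score)
```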
At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL) could be tested for a specific interaction within a block of four adjacent strains, the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL, and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05); and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
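The classification into new, unchanged, and quantitatively changed interactions can be summarized in a short Python sketch. Several points are assumptions rather than details given in the text: the FDR correction is taken to be the Benjamini-Hochberg procedure, the raw p-values are assumed to come from paired t-tests computed elsewhere, and the function names and category labels are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_reference, detected_longer, comparisons):
    """Classify one protein pair.

    comparisons: list of (difference, fdr_p) tuples, one per longer linker
    combination, where difference is relative to the 2xL-2xL reference.
    """
    if not detected_reference:
        return "new interaction" if detected_longer else "not detected"
    changed = any(abs(d) > 1 and p < 0.05 for d, p in comparisons)
    return "quantitative change" if changed else "unchanged"


raw_p = [0.001, 0.04, 0.03, 0.5]  # hypothetical paired t-test p-values
fdr_p = benjamini_hochberg(raw_p)
print(fdr_p)
print(classify_pair(True, True, [(1.8, fdr_p[0]), (0.2, fdr_p[3])]))
print(classify_pair(True, True, [(0.3, fdr_p[3])]))
print(classify_pair(False, True, []))
```

Note that in this sketch a pair with a significant p-value but an absolute difference of 1 or less is reported as unchanged; the text does not state how such cases were handled.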
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II, and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II, and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N, and 5FJA were selected as representative monomeric complexes for the RNApol I, II, and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were
used as nodes to build the graph. Edges were placed between each pair of nodes lying within
a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the two nodes. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
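The distance calculation described above can be illustrated with a short, self-contained sketch. The study used the Dijkstra implementation from NetworkX; the version below substitutes a standard-library implementation (heapq) so that it runs without dependencies. All function names, the data layout and the toy coordinates are assumptions made for illustration only.

```python
import heapq
import itertools
import math


def surface_residues(ca_coords, dots, max_dist):
    """Keep residues whose C-alpha lies within max_dist of any surface dot."""
    return {
        res: xyz
        for res, xyz in ca_coords.items()
        if any(math.dist(xyz, dot) <= max_dist for dot in dots)
    }


def build_graph(nodes, cutoff):
    """Connect every pair of nodes within cutoff; edge weight = distance."""
    graph = {res: {} for res in nodes}
    items = list(nodes.items())
    for i, (a, xyz_a) in enumerate(items):
        for b, xyz_b in items[i + 1:]:
            d = math.dist(xyz_a, xyz_b)
            if d <= cutoff:
                graph[a][b] = d
                graph[b][a] = d
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    tie = itertools.count()  # tie-breaker so node labels are never compared
    queue = [(0.0, next(tie), source)]
    while queue:
        dist, _, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, next(tie), neighbour))
    return math.inf


# Toy example (made-up coordinates in Å): three surface residues in a row
# and one buried residue that is far from every surface dot.
ca = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "buried": (10, 50, 0)}
dots = [(0, 5, 0), (10, 5, 0), (20, 5, 0)]
surface = surface_residues(ca, dots, max_dist=15.0)
graph = build_graph(surface, cutoff=15.0)
path_length = shortest_path_length(graph, "A", "C")
```

Because A and C are farther apart than the cutoff, the path has to go through B, so the measured distance follows the surface of the complex rather than a straight line through it.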
All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence
of protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively,
as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
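The size of the screen and the way PPIs are called and compared between linkers can be reproduced with simple arithmetic. The sketch below is illustrative only: the function name and the way the two-fold comparison is computed (a plain ratio of positive z-scores) are assumptions, since the text does not spell out the formula.

```python
N_BAIT_PROTEINS = 15
LINKERS = ("2xL", "3xL", "4xL")
N_PREYS = 592
Z_THRESHOLD = 2.5  # a PPI is called when its z-score exceeds this value

n_baits = N_BAIT_PROTEINS * len(LINKERS)  # 45 bait strains
n_tested = n_baits * N_PREYS              # 26,640 potential interactions


def compare_to_reference(z_ref, z_long, fold=2.0):
    """Label a PPI by comparing a longer linker to the 2xL reference.

    Assumption: the fold difference is the ratio of the two z-scores and
    is only evaluated when both z-scores are positive.
    """
    if max(z_ref, z_long) <= Z_THRESHOLD:
        return "not detected"
    if z_ref <= 0 or z_long <= 0:
        return "not comparable"
    ratio = z_long / z_ref
    if ratio > fold:
        return "increased"
    if ratio < 1 / fold:
        return "decreased"
    return "similar"
```

For instance, a pair scoring 3.0 with the 2xL and 7.5 with the 4xL would be labelled as increased under these assumptions, while a pair scoring below 2.5 with both linkers would not be called at all.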
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as a single
PPI) after filtration.
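The distinction between the total and the unique number of pairs comes from collapsing reciprocal bait-prey orientations. A minimal sketch of that bookkeeping, with a hypothetical function name and placeholder protein names X, Y and Z:

```python
def count_pairs(detected):
    """Count oriented and unique protein pairs.

    detected: iterable of (bait, prey) tuples, where the bait carries
    DHFR F[1,2] and the prey carries DHFR F[3].
    """
    oriented = set(detected)
    # A frozenset ignores orientation, so (X, Y) and (Y, X) collapse into one.
    unique = {frozenset(pair) for pair in oriented}
    return len(oriented), len(unique)


# X-Y was detected in both orientations, X-Z in only one.
n_oriented, n_unique = count_pairs([("X", "Y"), ("Y", "X"), ("X", "Z")])
```

With this convention, the 228 interacting pairs reported above correspond to the oriented count and the 197 pairs to the unique count.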
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
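The relationship between pairwise distance and z-score reported above is a Spearman rank correlation. As a minimal sketch of what that statistic computes (standard-library Python, function names chosen for illustration, and made-up values rather than data from Table S2A):

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: z-scores that fall as the C-terminal distance grows.
distances = [10.0, 20.0, 30.0, 40.0]
z_scores = [8.0, 6.0, 5.0, 1.0]
rho = spearman(distances, z_scores)
```

A strictly decreasing relationship such as this toy example gives a coefficient of -1; the weaker negative values reported for the proteasome indicate a trend rather than a strict ordering.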
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
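Panels (C) and (D) refer to a distance-weighted shortest path computed over surface residues. A minimal sketch of that idea is given below, assuming the path is found with Dijkstra's algorithm on a graph whose nodes are residue coordinates and whose edges join residues closer than a cutoff, each edge weighted by its Euclidean length. The cutoff, the function name and the toy coordinates are hypothetical; the text does not specify the actual implementation.

```python
import heapq
import math


def surface_path_length(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    Nodes are 3D coordinates (e.g. surface residues plus the two C-terminal
    residues). Two nodes are connected when they lie within `cutoff` of each
    other, and the edge weight is their Euclidean distance.
    Returns math.inf when no path exists.
    """
    best = [math.inf] * len(points)
    best[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u == end:
            return dist_u
        if dist_u > best[u]:
            continue  # stale heap entry
        for v, point_v in enumerate(points):
            if v == u:
                continue
            weight = math.dist(points[u], point_v)
            if weight <= cutoff and dist_u + weight < best[v]:
                best[v] = dist_u + weight
                heapq.heappush(heap, (best[v], v))
    return math.inf


# Toy coordinates (hypothetical): four points at the corners of a 3 x 4 rectangle.
corners = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 4, 0)]
print(surface_path_length(corners, start=0, end=2, cutoff=4.5))
```

Restricting edges to nearby surface points forces the path to follow the outside of the structure instead of cutting straight through it, which is the reason to prefer it over a plain Euclidean distance between the two termini.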
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close to each other in space without necessarily being physical interactions.
Such a method would make it possible to explore and dissect the architecture
of protein complexes in greater depth. In practice, this meant modifying the length of the
linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary
to verify whether increasing the linker length changed the interactions
detected. It was also relevant to test the application of the method to the study of
protein complexes using several combinations of linkers of different
lengths. Finally, validation of the method could be completed by
comparing the results obtained with the distances measured from the available
protein structures of the proteasome.
The results of the first validation show that by adjusting a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new
associations were observed within protein complexes and between different
complexes, notably between the proteasome and the actin cytoskeleton. The nature of the
associations detected suggests that the specificity of the DHFR PCA is preserved despite the
change in linker length. The in-depth study of the five protein
complexes shows that this variation of the DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the
unique interactions detected, more than 30% were new. One could therefore expect to
obtain nearly as many new interactions if this variation of the PCA were
applied to protein complexes that have already been studied. This percentage could vary with the
number of combinations of linkers of different lengths that are used. For example, it
could be reduced by using only a single combination, since some
protein-protein associations were detectable only with a specific combination
of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to
be sufficient to detect the majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased
are more likely caused by steric effects than by destabilization of the
proteins involved. Nevertheless, these cases can still provide structural
information, notably by identifying the strongest associations within the complex.
Moreover, the use of extended linkers provides information on the organization of protein
complexes, particularly when it involves the central proteins. Finally, the
associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the
linker length does allow the detection of associations between proteins that are further apart
in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. Simply by doubling the length of the linker of the
DHFR F[1,2] fragment, it is possible to increase the ability to detect distant
protein-protein associations. In future experiments, it would be appropriate to use the
standard linker in addition to linkers of additional lengths, which would
provide a validation and a point of comparison and would help detect problems that may have occurred
during the construction of the proteins. For example, it is easier to spot a problem of
incorrect recombination or the appearance of mutations. Indeed, interactions would be
observed for the correctly constructed protein, whereas the
problematic one would show none. However, adding this control certainly
makes the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA gives access to an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet can also be applied at
large scale using a robotic platform. Moreover, the DHFR PCA is an
in vivo method that retains the endogenous promoter for protein expression. The
fragments do not tend to interact spontaneously with each other unless they are
very close together, which reduces false positives. The DHFR PCA can be performed either on
solid medium or in liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the study of interaction dynamics (56). These
features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting
to establish a range of linker lengths that would provide several resolutions
of the PPI network. It would first be necessary to determine the maximum length that still detects
plausible protein-protein associations while limiting false positives. It would also be necessary
to determine the optimal increment to maximize new information, taking
into account the additional complexity that comes with each added linker. The availability of
robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins
with different linker lengths. Such additional collections
would provide a picture of the yeast protein-protein association network at different
resolutions, from fine to coarse. Indeed, the longer the linker,
the more distant the associations detected, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively,
its characteristics, such as its size and flexibility, should be taken into account. For
small protein complexes, a finer resolution, and therefore shorter linkers,
might be sufficient, whereas a coarser resolution would be needed for
large protein complexes.
The method developed during this master's project becomes particularly interesting
for the study of macromolecular protein complexes. These are complexes whose
composition is not perfectly known but that are visible by electron
microscopy or with other imaging methods. The size of these complexes greatly
limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular
stress, and they are notably linked to various diseases such as cancer and
acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by
lengthening the linker would give us a general picture of their
architecture. In the case of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
would compensate for the shortcomings of high-molecular-resolution methods such as
crystallography or nuclear magnetic resonance, which determine the precise structure
of proteins or protein complexes. Indeed, these methods are difficult to apply to
many protein complexes and require an approach tailored to each complex.
1.5 The linker: a potentially useful parameter for
modulating the detection of protein-protein interactions
Because of its relative simplicity and of the linker that connects the reporter fragments to the
proteins of interest, PCA is a method of choice for developing a hybrid
method. The linker is a short, soluble and flexible peptide segment composed of two
repeats of the following motif: four glycines and one serine (GGGGS). It ensures good
flexibility and good association of the reporter fragments in the cellular
environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter
polar but uncharged. The linker connects the reporter fragment to the C-terminus of the
proteins under study.
The linker length also places a certain constraint on the ability to
detect an interaction, as was notably observed by the research team that
developed PCA at large scale (55). While studying RNA
polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was
35 times more likely to be detected when the C-termini of the proteins of interest were
less than 82 Å apart (55). This distance corresponds to the length of the
two linkers placed end to end. In addition, an earlier study had shown that
increasing the linker length made it possible to determine the conformation of a
dimeric receptor (69). It is thus possible to detect new interactions and, in
doing so, to obtain new structural information.
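The relationship between linker length and detection range described above can be put into numbers with a rough sketch. The only figures taken from the text are the GGGGS motif, the two repeats of the standard linker and the 82 Å span of two standard linkers end to end. Everything else is an assumption made for illustration: the function names are hypothetical, and the span is assumed to scale linearly with the number of residues, which ignores linker flexibility and the size of the reporter fragments.

```python
MOTIF = "GGGGS"             # repeat unit of the linker (from the text)
STANDARD_REPEATS = 2        # the standard linker has two repeats (from the text)
STANDARD_PAIR_SPAN = 82.0   # Å, two standard linkers end to end (from the text)


def linker_sequence(repeats):
    """Amino-acid sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def pair_span(repeats_a, repeats_b):
    """Approximate end-to-end span (Å) of two linkers.

    Assumption: the span grows linearly with residue count, using the
    per-residue length implied by the 82 Å figure for two standard linkers.
    """
    residues_in_standard_pair = 2 * STANDARD_REPEATS * len(MOTIF)
    per_residue = STANDARD_PAIR_SPAN / residues_in_standard_pair
    return per_residue * len(MOTIF) * (repeats_a + repeats_b)


# Standard pair, one doubled linker, and two doubled linkers.
for a, b in [(2, 2), (2, 4), (4, 4)]:
    print(linker_sequence(a), linker_sequence(b), pair_span(a, b))
```

Under this linear assumption, doubling one linker extends the reach by half of the standard span and doubling both extends it by the full standard span. Real reach is expected to be shorter than a fully extended chain, so these values are upper-bound style estimates at best.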
1.6 Research objectives
The previous results suggest that linker length can influence our
ability to detect PPIs. The hypothesis of my work was that increasing the
linker length of the DHFR PCA would allow the detection of interactions increasingly
distant in space, which would modulate the scale of molecular resolution. This
adaptation would then yield a new hybrid method that could help
define protein-protein associations between protein complexes and subcomplexes. The
first objective was to assess the general impact of different linker lengths on
the ability to detect protein-protein associations. To meet this objective,
protein-protein associations between 15 proteins found in seven protein complexes
were tested against the proteins found in these complexes and their known interactors.
The second objective was to assess the impact of increasing the linker length
on the understanding of the architecture of protein complexes and their subcomplexes.
Five protein complexes differing in size and flexibility were
studied: the proteasome, RNApol I, II and III, and the conserved
oligomeric Golgi (COG) complex. The study was carried out with different combinations of
linker lengths. The last objective was to verify whether increasing the
linker length allowed the detection of associations between proteins that are further
apart in space. To do so, distances were calculated between the proteins
contained in the proteasome structures and compared with the experimental
results.
This study was carried out using the eukaryotic model organism S. cerevisiae.
Yeast is particularly attractive in several respects, notably the availability
of numerous powerful genetic tools, its rapid cell division and
the abundance of data on the structure of protein complexes and on PPIs.
Moreover, this organism has played a key role in advancing knowledge in
various fields such as the determination of protein function, regulatory
networks, gene expression, protein interaction networks and the study of
human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how
proteins assemble with one another into complexes and determining their
spatial arrangements. We examined the potential of the protein-fragment complementation
assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain
low-resolution structural restraints on protein complexes. We showed that
using extended peptide linkers between the fusion proteins and the
DHFR fragments improves the detection of protein-protein interactions and reveals
interactions that are more distant in space. Extended linkers thus provide an
improved tool for detecting and measuring protein-protein interactions and protein proximity
in vivo. We used this tool to further investigate the architecture of the RNA
polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our
results open new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are farther apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions and
protein proximity in living cells. We use this tool to further investigate the architecture
of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in
yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational, and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex is dependent on stable
association among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used
yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
that detect co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was
shown that the distance constraint determined by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter allows the detection of configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here, we test the effect of linker size on the ability to detect
PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, the 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from gel. The
remaining steps were performed as described above for the pAG25-3xL/4xL-DHFR
F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. The
2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing
confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
pepstatin A, 0.5 µg/mL leupeptin, and 2 µg/mL aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000), or
mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000),
IRDye® 680RD donkey anti-goat IgG (1:5000), or IRDye® 800CW goat anti-mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected on the criteria that they belonged to the same
complexes as the baits or that they interacted with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: the RNApol I, II, and III, the proteasome, and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17), and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing
89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL
in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
First, bait plates were prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed into 384-format arrays and finally into 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null, or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, and kept the particle closest to the center. We considered the
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined visually. The average of the background pixels was
subtracted from the colony intensity.
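The particle exclusion rule above can be illustrated with a short sketch. It is not the custom ImageJ script used in this study. It assumes the circularity definition reported by ImageJ (4π × area / perimeter²), reads the circularity cutoff as 0.5, and assumes that a particle failing either criterion is rejected; the function names are hypothetical.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Circularity as defined in ImageJ: 4*pi*area / perimeter**2.

    A perfect circle gives 1.0; elongated shapes approach 0.
    """
    if perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * area / perimeter ** 2


def keep_particle(area: float, perimeter: float,
                  min_area: float = 20.0,
                  min_circularity: float = 0.5) -> bool:
    """Return True if a particle passes both filters.

    Assumption: a particle is rejected when EITHER its area is below
    min_area (pixels) OR its circularity is below min_circularity.
    """
    return area >= min_area and circularity(area, perimeter) >= min_circularity
```

A round colony of radius 10 pixels passes both filters, whereas a 100 × 1 pixel streak is rejected on circularity even though its area is large enough.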
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
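As a minimal sketch of this transformation, the snippet below drops colonies that were too small on the diploid selection plate and applies log2(x + 1) to the remaining MTX intensities. The record layout (a pair of diploid size and MTX intensity) and the function name are assumptions made for illustration.

```python
import math


def log2_intensities(records, min_diploid_size=16):
    """Filter and transform colony intensities.

    records: iterable of (diploid_size, mtx_intensity) pairs (hypothetical layout).
    Colonies smaller than min_diploid_size on the diploid plate are dropped;
    the remaining intensities are transformed as log2(intensity + 1).
    """
    transformed = []
    for diploid_size, mtx_intensity in records:
        if diploid_size < min_diploid_size:
            continue  # colony did not grow well enough on diploid selection
        transformed.append(math.log2(mtx_intensity + 1))
    return transformed
```

Adding 1 before the logarithm keeps colonies with zero intensity in the data set (they map to 0) instead of producing an undefined value.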
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
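The scoring rule can be sketched as follows. The sketch assumes that the background mean b and standard deviation sdb have already been estimated by the mixture model (the mixture fit itself is not reproduced here), takes the detection threshold to be 2.5, and uses hypothetical function names.

```python
def z_score(interaction_score: float, b: float, sdb: float) -> float:
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    if sdb <= 0:
        raise ValueError("background standard deviation must be positive")
    return (interaction_score - b) / sdb


def is_detected(zs: float, threshold: float = 2.5) -> bool:
    """An interaction is called detected when its z-score exceeds the threshold."""
    return zs > threshold


def significant_change(zs_reference: float, zs_longer: float,
                       min_difference: float = 2.0) -> bool:
    """Compare the same interaction measured with two linker combinations."""
    return abs(zs_longer - zs_reference) > min_difference
```

For example, with b = 8 and sdb = 2, an interaction score of 14 gives Zs = 3 and is called detected; if the same pair reaches Zs = 5.5 with a longer linker, the difference of 2.5 counts as a significant change.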
For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as
values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
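The outlier rule at the start of this paragraph corresponds to fences placed three interquartile ranges beyond the quartiles. The sketch below illustrates it with the Python standard library; the quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, since the convention used in the original analysis is not stated.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Drop values outside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    Assumption: quartiles are computed with the 'inclusive' method;
    other conventions give slightly different fences on small samples.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

With replicate colony sizes of 10, 11, 12, 13, 14 and 100, the fences are 3.75 and 21.25, so only the value of 100 is discarded.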
At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL), the Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL, and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
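The classification logic can be sketched as below. Two points are assumptions rather than statements from the text: the FDR correction is taken to be the Benjamini-Hochberg procedure, and a pair with a significant p-value but an absolute difference of 1 or less is treated as unchanged. The paired t-test itself is not reproduced; the function takes corrected p-values as input, and all names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_reference, detected_longer, difference, fdr_p,
                  min_difference=1.0, alpha=0.05):
    """Classify one protein pair for one longer-linker combination.

    difference: Is(longer linker) - Is(2xL-2xL reference).
    fdr_p: FDR-corrected p-value of the paired t-test.
    """
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    if abs(difference) > min_difference and fdr_p < alpha:
        return "quantitative change"
    return "unchanged"
```

For instance, raw p-values of 0.01, 0.04, 0.03 and 0.005 are adjusted to 0.02, 0.04, 0.04 and 0.02, so all four comparisons would remain below the 0.05 cutoff.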
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II, and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II, and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N, and 5FJA were selected as representative monomeric complexes for
the RNApol I, II, and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base, and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid, and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid, and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as
nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of the edges was equal to the distance between node pairs. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
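The surface-path distance can be illustrated with the sketch below. It builds a graph whose nodes are 3D coordinates (standing in for surface Cα atoms), connects nodes closer than a cutoff with edges weighted by their Euclidean distance, and returns the weighted shortest path length. The analysis described above used NetworkX; this sketch swaps in a dependency-free Dijkstra implementation based on `heapq`, and the node labels and coordinates in the usage note are invented for illustration.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """coords: dict mapping node label -> (x, y, z) in angstroms.

    An edge joins every pair of nodes separated by at most `cutoff`;
    its weight is the Euclidean distance between the two nodes.
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        if node == target:
            return dist
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf
```

With four nodes A (0, 0, 0), B (10, 0, 0), C (20, 0, 0) and D (20, 10, 0) and a cutoff of 15 Å, A and D are not directly connected (they are about 22.4 Å apart), and the shortest path A-B-D has a length of 10 + √200 ≈ 24.1 Å. This shows how a path constrained to the surface graph is longer than the straight-line distance.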
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL, and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110, and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL, and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as
compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II, and III, proteasome, and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II, and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions
such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as only one
PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e., X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10, and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core, and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for
distance, we measured the shortest path between the C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig. 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the cumulative
distribution of z-scores as a function of pairwise distance, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig. 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that
the distance between proteins is significantly longer for cases where a longer linker
increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
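The rank correlation between pairwise distance and z-score can be illustrated with a short, dependency-free sketch. It is not the code used for the study: the function names and the example values are hypothetical, and ties are handled with average ranks, which is one common convention.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up distances (in angstroms) and z-scores, for illustration only.
distances = [10, 20, 30, 40, 50]
z_scores = [9, 7, 8, 3, 1]
rho = spearman(distances, z_scores)
```

A negative `rho`, as in this toy example, corresponds to the trend reported here: the signal tends to drop as the C-termini get further apart.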
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here we
show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions, with an increased signal-to-noise ratio
and with an enhanced ability to detect distant PPIs, including interactions among complexes
and subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection
could be used as preys, requiring only the construction of baits with different linker
sizes. PCA is therefore an addition to the other methods available to detect low-resolution
structural information among subunits of complexes, which include chemical cross-linking of
protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent
biotinylation in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants
299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by
an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for
feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and
a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are
highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data
filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange
squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed
PPIs (significantly decreased or increased when compared to the 2xL-2xL reference
interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker
combination but detected with a longer linker combination). (C) Proportions of
quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes,
counting reciprocal interactions, such as X-DHFR F[1,2] - Y-DHFR F[3] and
Y-DHFR F[1,2] - X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for
selected complexes. Line thickness is proportional to the difference between the 4xL-4xL
and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased
PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent
proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the
total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome
catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at
different distances or in different subunits are highlighted on each structure. Distances
between the C-termini of these selected proteins and the associated PPI z-scores for these
newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance
between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL:
r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini: interactions that are
not affected by longer linkers, those that increase in signal, and those that are newly
detected. p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
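As an illustration of what an identity percentage such as those in Table S2B measures, the sketch below computes percent identity from two already aligned sequences. The thesis does not state which tool or denominator was used, so the convention here (matches divided by the number of positions where neither sequence has a gap) and the example sequences are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned positions that contain no gap.

    The denominator convention is an assumption; other tools divide by the
    alignment length or by the length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip positions missing from one of the two sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments: one mismatch (T/S) and one gapped position.
identity = percent_identity("MKT-AY", "MKSQAY")
```

With this convention, residues that are missing from the structure do not lower the identity; they are accounted for separately, as in Table S2D.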
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair
protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI
signal (colony size) obtained in the global PCA (top left) and in the intra-complex
(proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right)
experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond
to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for
PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation
coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and
r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations
of linker lengths according to the distance between the interacting proteins. The red line
represents the density of distances for all interactions. The distribution for detected
interactions is shifted to the left because proteins are closer to each other when the
interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right
due to the ability of the 4xL linker to detect interactions between proteins further apart
in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA
experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution
assay of selected results from the intra-complex experiment. Examples for each category of
changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony
size in standard PCA (left).
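The relation between colony size, z-score and detection threshold described in panel (B) can be sketched as follows. This is only an illustration: how the background distribution was actually defined is not stated in this caption, so standardizing against a set of background colony sizes, the 2.5 cutoff used as a parameter and the example values are assumptions.

```python
import statistics

Z_CUTOFF = 2.5  # assumed cutoff separating positive PPIs from background


def background_stats(background_sizes):
    """Mean and sample standard deviation of the background colony sizes."""
    return statistics.mean(background_sizes), statistics.stdev(background_sizes)


def z_scores(colony_sizes, background_sizes):
    """Express each colony size as a deviation from the background, in SD units."""
    mu, sigma = background_stats(background_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def size_threshold(background_sizes, z_cutoff=Z_CUTOFF):
    """Colony size that corresponds to the z-score cutoff."""
    mu, sigma = background_stats(background_sizes)
    return mu + z_cutoff * sigma


# Made-up colony sizes (arbitrary units), for illustration only.
background = [10.0, 12.0, 14.0, 16.0, 18.0]
observed = [14.0, 30.0]
scores = z_scores(observed, background)
positive = [z > Z_CUTOFF for z in scores]
```

Here the first colony sits exactly at the background mean and is not called, while the second lies well above `size_threshold(background)` and is called positive.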
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure
(right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and
yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of
Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to
calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues.
(D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon:
proteasome. Purple spheres: surface dots. Green spheres: surface residues on the
proteasome.
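A distance-weighted shortest path of the kind illustrated in panels (C) and (D) can be sketched with Dijkstra's algorithm on a graph of surface residues. This is a minimal sketch, not the procedure used in the thesis: the neighbour cutoff, the residue identifiers and the coordinates below are hypothetical, and how surface residues are selected is left out.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect residues closer than `cutoff`; edges are weighted by distance.

    coords maps a residue identifier to its (x, y, z) position in angstroms.
    """
    ids = list(coords)
    graph = {residue: [] for residue in ids}
    for index, a in enumerate(ids):
        for b in ids[index + 1:]:
            distance = math.dist(coords[a], coords[b])
            if distance <= cutoff:
                graph[a].append((b, distance))
                graph[b].append((a, distance))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Hypothetical coordinates: two C-terminal residues and two surface residues.
coords = {
    "A_Cter": (0.0, 0.0, 0.0),
    "s1": (3.0, 0.0, 0.0),
    "s2": (3.0, 4.0, 0.0),
    "B_Cter": (6.0, 4.0, 0.0),
}
graph = build_graph(coords, cutoff=4.5)
path_length = shortest_path_length(graph, "A_Cter", "B_Cter")
```

Because the path is forced to hop between neighbouring surface residues, the resulting length (10 Å in this toy example) is longer than the straight-line distance between the two C-termini (about 7.2 Å), which is the point of routing the path along the surface of the complex rather than through it.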
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close in
space without necessarily being in physical interaction. Such a method would make it
possible to examine and dissect the architecture of protein complexes in greater depth.
Concretely, the approach was to modify the length of the linkers used in DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
method on protein complexes using several combinations of linkers of different lengths.
Finally, the validation could be completed by comparing the results obtained with the
distances measured from the available protein structures of the proteasome.
The results of the first validation demonstrate that by adjusting a single parameter,
namely doubling the length of one linker, the signal-to-noise ratio increased
significantly, allowing better identification of associations. Seven new associations were
observed within protein complexes and between different complexes, notably between the
proteasome and the actin cytoskeleton. The nature of the associations detected suggests
that the specificity of DHFR PCA is preserved despite the change in linker length. The
in-depth study of the five protein complexes shows that this variation of DHFR PCA detects
new interactions while preserving the specificity of the method. Indeed, among all the
unique interactions detected, more than 30% were new. One could therefore expect to obtain
nearly as many new interactions if this variation of PCA were applied to protein complexes
that have already been studied. This percentage could vary with the number of combinations
of linkers of different lengths used. For example, it could be reduced by using a single
combination, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be
sufficient to detect the majority of the new PPIs and of those whose signal increases.
The rare cases where the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves central proteins. Finally,
the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures
with the PCA results obtained, it is possible to confirm that increasing the linker length
does allow the detection of associations between proteins that are more distant in space.
The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of additional lengths, which would provide a validation and a point of
comparison and would reveal problems that may have occurred during construction of the
fusion proteins. For example, it is easier to spot incorrect recombination or the
appearance of mutations: interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. However, adding this control
certainly makes the experiments and analyses more complex. Despite this drawback, this
variation of DHFR PCA provides an additional hybrid method that remains relatively simple.
It requires no particular infrastructure, yet can also be applied at large scale using a
robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous
promoter for protein expression. The fragments do not tend to associate spontaneously
unless they are very close together, which reduces false positives. DHFR PCA can be
performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the study of interaction dynamics (56). These
features provide certain advantages compared to other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths providing several resolutions of the PPI network. It would first
be necessary to determine the maximum length that allows the detection of plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize new information, taking into account the
additional complexity introduced by each added linker. The availability of robotic
platforms makes the creation of collections of DHFR F[1,2] fusion proteins with different
linker lengths more realistic. Such additional collections would provide a picture of the
yeast protein-protein association network at different resolutions, from fine to coarse.
Indeed, the more the linker length is increased, the more distant the detected
associations are, which decreases the molecular resolution. Before investigating a protein
complex more exhaustively, its characteristics, such as its size and flexibility, should
be taken into account. For small protein complexes, a finer resolution, and therefore
shorter linkers, might be sufficient, whereas the resolution should be coarser for large
protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
perfectly known but that are visible by electron microscopy or with other imaging methods.
The size of these complexes greatly limits their study and represents a challenge for
determining their architecture. Processing bodies and stress granules are one example. They
are involved, respectively, in the degradation and the storage of messenger RNA during
cellular stress, and they are notably linked to various diseases such as cancer and
acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by
lengthening the linker would allow us to obtain a general picture of their architecture.
At the level of an organism's proteome, this method would provide a better definition of
the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
adaptation would then provide a new hybrid method that could help define protein-protein
associations between protein complexes and subcomplexes. The first objective was to assess the
general impact of different linker lengths on the ability to detect protein-protein
associations. To this end, protein-protein associations between 15 proteins found in seven
protein complexes were tested against the proteins found in these complexes and their known
interactors. The second objective was to assess the impact of increasing linker length on our
understanding of the architecture of protein complexes and their subcomplexes. Five protein
complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III,
and the conserved oligomeric Golgi (COG) complex. The study was carried out with different
combinations of linker lengths. The last objective was to determine whether increasing linker
length made it possible to detect associations between proteins that are farther apart in
space. To do so, distances were calculated between the proteins contained in the proteasome
structures and compared with the experimental results.
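The distance calculation in the last objective can be illustrated with a minimal Python sketch. It assumes one representative 3D coordinate per protein (in ångströms); the subunit names and coordinates below are hypothetical placeholders, and the text does not specify which atoms or reference points were actually used.

```python
import math
from itertools import combinations

# Hypothetical reference coordinates (in angstroms), one per protein.
# Real values would be extracted from a published structure file.
reference_points = {
    "SubunitA": (0.0, 0.0, 0.0),
    "SubunitB": (30.0, 40.0, 0.0),
    "SubunitC": (0.0, 0.0, 120.0),
}

def pairwise_distances(coords):
    """Return {(name1, name2): Euclidean distance} for every unordered pair."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }

distances = pairwise_distances(reference_points)

# List pairs from closest to farthest, ready to compare with PCA signal.
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 1))
```

Each distance can then be set against the experimental signal measured for the same pair with each linker combination.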
This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is
particularly attractive in several respects, notably the availability of many powerful genetic
tools, its rapid cell division, and the abundance of data on the structure of protein complexes
and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various
fields such as the determination of protein function, regulatory networks, gene expression,
protein interaction networks and the study of human diseases (70).
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with one
another into complexes and determining their spatial arrangements. We examined the potential of
the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast
to obtain low-resolution structural restraints on protein complexes. We showed that using
extended peptide linkers between the fusion proteins and the DHFR fragments improves the
detection of protein-protein interactions and reveals interactions that are more distant in
space. Extended linkers thus provide an improved tool for detecting and measuring
protein-protein interactions and protein proximity in vivo. We used this tool to further
investigate the architecture of the RNA polymerases, the proteasome and the conserved
oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein
networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are farther apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions and
protein proximity in living cells. We use this tool to further investigate the architecture of
the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in
yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex is dependent on stable
association among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
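To illustrate why a single purification yields many inferred interactions, the sketch below enumerates every unordered pair among co-purified proteins. This all-against-all expansion (often called the "matrix" model) is shown only as one common convention and is not necessarily the rule applied by the studies cited above; the protein names are hypothetical.

```python
from itertools import combinations

def matrix_pairs(copurified):
    """Return all unordered pairs among co-purified proteins (matrix expansion)."""
    return list(combinations(sorted(set(copurified)), 2))

# Hypothetical purification result: one bait plus four co-purified proteins.
purification = ["Bait", "P1", "P2", "P3", "P4"]
pairs = matrix_pairs(purification)

# n proteins give n * (n - 1) / 2 inferred pairs, none of which carries
# information about which proteins actually touch each other.
print(len(pairs))
```

The number of inferred pairs grows quadratically with complex size, while the spatial arrangement of the members stays unresolved.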
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology, because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was
shown that the distance constraint determined by linker length could affect the ability to
detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 3.5 times more likely to be detected if the C-termini were within less than
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes
could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly,
distances between proteins in living cells. Here we test the effect of linker size on the
ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
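The ruler idea can be made concrete with a rough upper-bound calculation. The Python sketch below assumes a fully extended chain with about 3.8 Å between consecutive alpha carbons and ignores the size of the reporter fragments and of the proteins themselves; both are simplifying assumptions, so the numbers are ceilings on linker reach rather than expected detection distances.

```python
# Assumed spacing between consecutive alpha carbons in a fully extended
# chain, in angstroms. A flexible linker will usually span less than this.
ANGSTROM_PER_RESIDUE = 3.8
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser repeat unit

def linker_sequence(repeats):
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats

def max_linker_span(repeats, per_residue=ANGSTROM_PER_RESIDUE):
    """Upper bound on the length of one linker, in angstroms."""
    return len(linker_sequence(repeats)) * per_residue

def max_pair_reach(repeats_a, repeats_b):
    """Upper bound on the distance bridged by the two linkers of a pair."""
    return max_linker_span(repeats_a) + max_linker_span(repeats_b)

for repeats in (2, 3, 4):
    print(f"{repeats}x linker: {len(linker_sequence(repeats))} residues, "
          f"up to {max_linker_span(repeats):.0f} angstroms")
```

Under these assumptions, doubling the number of repeats on both partners doubles the theoretical reach of the pair, which is the intuition behind testing longer linkers.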
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons encoding the same peptide sequence.
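The use of synonymous codons can be illustrated with a short Python sketch: two different DNA sequences translate to the same Gly-Gly-Gly-Gly-Ser repeat. The two nucleotide sequences are hypothetical examples chosen for illustration, not the sequences of the plasmids described here; the codon assignments are those of the standard genetic code.

```python
# Standard genetic code, restricted to glycine (G) and serine (S) codons.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}

def translate(dna):
    """Translate an in-frame DNA string made only of Gly/Ser codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def nucleotide_identity(seq_a, seq_b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(seq_a, seq_b)) / len(seq_a)

# Two hypothetical encodings of a single GGGGS repeat.
repeat_a = "GGTGGCGGTGGCTCT"
repeat_b = "GGAGGGGGAGGTAGC"

print(translate(repeat_a), translate(repeat_b))
print(f"nucleotide identity: {nucleotide_identity(repeat_a, repeat_b):.2f}")
```

The peptide is identical in both cases even though the DNA differs at several positions, which is all that "synonymous" requires.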
To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was
gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
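For readers setting up a similar assembly, an insert-to-vector ratio is commonly converted into DNA masses as sketched below. The sketch assumes the ratio is molar, which the text does not state explicitly, and the fragment sizes and vector amount are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector.

    For double-stranded DNA, mass is proportional to length times number of
    molecules, so the insert mass scales with insert_bp / vector_bp.
    """
    return molar_ratio * vector_ng * insert_bp / vector_bp

# Hypothetical example: 50 ng of a 5000 bp vector, a 500 bp insert,
# and five insert molecules per vector molecule.
needed = insert_mass_ng(vector_ng=50, vector_bp=5000, insert_bp=500, molar_ratio=5)
print(f"{needed:.1f} ng of insert")
```

The same function can be called once per insert when an assembly combines several fragments with one vector.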
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were
confirmed for all strains by PCR and Sanger sequencing.
Estimation of protein abundance
Protein abundance was quantified by Western blot for several strains with proteins fused to
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 units of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
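A comparison of abundance between linker variants from such blots typically normalizes each band to its loading control first. The Python sketch below shows that arithmetic only; the band intensities are hypothetical and the exact normalization applied in this study is not described in the text.

```python
def normalized_signal(target_intensity, loading_intensity):
    """Target band intensity divided by the loading-control band intensity."""
    if loading_intensity <= 0:
        raise ValueError("loading control intensity must be positive")
    return target_intensity / loading_intensity

def fold_change_4x_over_2x(bands_2x, bands_4x):
    """Ratio of normalized abundance: 4xL fusion relative to 2xL fusion."""
    return normalized_signal(*bands_4x) / normalized_signal(*bands_2x)

# Hypothetical (target, actin) intensities for one protein in two strains.
bands_2x = (1200.0, 1000.0)
bands_4x = (1500.0, 1250.0)

ratio = fold_change_4x_over_2x(bands_2x, bands_4x)
print(f"4xL / 2xL abundance ratio: {ratio:.2f}")
```

A ratio close to 1 would indicate that the longer linker does not noticeably change the abundance of the fusion protein.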
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or interact
with one of them according to data reported in BioGRID in October 2014 (96). A random set of
97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included
in the set of preys as controls. Each prey was present in four replicates, two on each prey
plate, so each interaction was measured four times. Preys were randomly positioned to avoid
location biases.
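The randomized prey layout can be sketched in Python as follows. The sketch assumes a 32 × 48 grid (1536 positions) with a two-colony border reserved on each side; the border width for these particular plates, the strain labels and the random seed are assumptions made for illustration.

```python
import random

ROWS, COLS = 32, 48   # 1536-format colony array
BORDER = 2            # assumed width of the reserved border, in colonies

def interior_positions(rows=ROWS, cols=COLS, border=BORDER):
    """Grid coordinates available once the border is set aside."""
    return [(r, c)
            for r in range(border, rows - border)
            for c in range(border, cols - border)]

def random_layout(strains, replicates_per_plate, seed=0):
    """Assign each strain to `replicates_per_plate` random interior positions."""
    positions = interior_positions()
    colonies = [s for s in strains for _ in range(replicates_per_plate)]
    if len(colonies) > len(positions):
        raise ValueError("not enough interior positions for all colonies")
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    rng.shuffle(positions)
    return dict(zip(positions, colonies))

# 495 selected preys plus 97 random controls, two replicates per plate.
preys = [f"prey_{i:03d}" for i in range(495)] + [f"control_{i:02d}" for i in range(97)]
layout = random_layout(preys, replicates_per_plate=2)
print(len(interior_positions()), "interior positions,", len(layout), "used")
```

With these assumptions, two replicates of all 592 prey strains fit on a single plate, and a second plate built the same way provides the four measurements per interaction.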
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by Benschop et al. (84) to choose 95 central and
associated proteins, members of the following complexes: RNApol I, II and III, the proteasome
and the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, within a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
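The block design can be expressed compactly in Python. The sketch below enumerates the four linker combinations for one bait-prey pair and counts the colonies needed for a given number of pairs and replicates; the protein names and the example pair count are hypothetical, and listing the bait linker first in each combination is an assumption.

```python
from itertools import product

LINKERS = ("2xL", "4xL")

def linker_block(bait, prey):
    """Return the four (bait, prey) linker combinations tested in one block."""
    return [(f"{bait}-{bait_linker}", f"{prey}-{prey_linker}")
            for bait_linker, prey_linker in product(LINKERS, repeat=2)]

def colonies_needed(n_pairs, replicates):
    """Colonies required to test every pair in `replicates` blocks of four."""
    return n_pairs * len(LINKERS) ** 2 * replicates

block = linker_block("BaitX", "PreyY")
for bait_strain, prey_strain in block:
    print(bait_strain, "x", prey_strain)

# Hypothetical example: 10 protein pairs, 14 replicate blocks each.
print(colonies_needed(n_pairs=10, replicates=14), "colonies")
```

Keeping the four combinations adjacent in one block means they share the same local growth conditions, so differences within a block mostly reflect linker length.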
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which
the difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
the results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files, and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles
touching the edge of the selection and those with an area below 20 pixels and a
circularity below 0.5, and used the particle closest to the center. We considered the
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained, and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
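The scoring step above can be illustrated with a minimal sketch. It is not the analysis code used in this study (which relied on the R package mixtools): it is a small, dependency-free EM fit of a two-component normal mixture, in which the component with the lower mean is assumed to be the background. The function names (`fit_two_normals`, `z_score`) and the toy scores are hypothetical and serve only as an illustration.

```python
import math
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM (illustrative stand-in).

    Returns [(mean, sd, weight), ...] sorted by mean, so that the first
    component can be treated as the background distribution.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    # Crude initialisation: lower and upper halves of the sorted scores.
    means = [statistics.mean(xs[:half]), statistics.mean(xs[half:])]
    sds = [statistics.pstdev(xs) or 1.0] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update mean, standard deviation and weight.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / len(xs)
    order = sorted(range(2), key=lambda k: means[k])
    return [(means[k], sds[k], weights[k]) for k in order]


def z_score(interaction_score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, as defined in the text."""
    return (interaction_score - background_mean) / background_sd


# Toy interaction scores (invented): many background values, a few strong ones.
toy_scores = [-1.0, -0.5, 0.0, 0.5, 1.0] * 20 + [9.0, 10.0, 11.0] * 5
(b, sdb, _), _signal = fit_two_normals(toy_scores)
detected = [s for s in toy_scores if z_score(s, b, sdb) > 2.5]
```

In this sketch the 2.5 cutoff is the one stated in the text; everything else (initialisation, number of iterations) is an arbitrary choice made for brevity.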
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
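As an illustration of the outlier rule described above, the following sketch removes replicate values lying outside the fences Q1 - 3(Q3 - Q1) and Q3 + 3(Q3 - Q1). The quartile definition (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not state which definition was used, and the function name and replicate values are hypothetical.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]; k = 3 as in the text."""
    # Assumption: quartiles computed with the "inclusive" method.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Invented replicate colony sizes for one bait-prey block.
replicates = [10, 11, 12, 13, 14, 15, 16, 100]
kept = remove_extreme_outliers(replicates)
interaction_score = statistics.median(kept)  # median of the kept replicates (Is)
```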
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned in such a way that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
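The classification rules above can be summarised in a short sketch. Two assumptions are made here and are not stated in the text: the FDR correction is taken to be the Benjamini-Hochberg procedure, and a pair detected with the reference linker that does not meet both criteria for a quantitative change is labelled unchanged. The p-values are assumed to come from paired t-tests computed elsewhere, and the function names and example values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


LONGER_COMBOS = ("2xL-4xL", "4xL-2xL", "4xL-4xL")


def classify_pair(z_by_combo, diff_by_combo, fdr_by_combo,
                  z_cutoff=2.5, diff_cutoff=1.0, alpha=0.05):
    """Label one protein pair from its z-scores, Is differences and FDR p-values."""
    if z_by_combo["2xL-2xL"] <= z_cutoff:
        # Not detected with the reference linker.
        if any(z_by_combo[c] > z_cutoff for c in LONGER_COMBOS):
            return "new"
        return "not detected"
    for combo in LONGER_COMBOS:
        if abs(diff_by_combo[combo]) > diff_cutoff and fdr_by_combo[combo] < alpha:
            return "quantitative change"
    return "unchanged"


# Invented example: detected with 2xL-2xL, stronger with 4xL-4xL.
z = {"2xL-2xL": 5.0, "2xL-4xL": 5.2, "4xL-2xL": 4.9, "4xL-4xL": 7.5}
diff = {"2xL-4xL": 0.2, "4xL-2xL": -0.1, "4xL-4xL": 1.6}
fdr = dict(zip(LONGER_COMBOS, benjamini_hochberg([0.5, 0.8, 0.003])))
label = classify_pair(z, diff, fdr)
```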
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the core particle (CP), using
the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is incomplete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were
used as nodes to build the graph. Edges were placed between each pair of nodes
within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
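The graph-based distance can be sketched as follows. The study used the Dijkstra implementation of NetworkX; the version below is a dependency-free stand-in written with the standard library, intended only to show the idea: nodes are surface Cα coordinates, edges link nodes closer than a cutoff, edge weights are Euclidean distances, and the distance between two residues is the length of the weighted shortest path. Function names and coordinates are hypothetical.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """coords: {node: (x, y, z)}. Link nodes within cutoff; weight = distance."""
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            d = math.dist(coords[u], coords[v])
            if d <= cutoff:
                graph[u].append((v, d))
                graph[v].append((u, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Invented coordinates: the path must go around a corner because the
# direct a-c distance exceeds the cutoff.
coords = {"a": (0, 0, 0), "b": (12, 0, 0), "c": (12, 12, 0), "d": (100, 100, 100)}
graph = build_surface_graph(coords, cutoff=15)
path_a_to_c = shortest_path_length(graph, "a", "c")
```

Because the path is constrained to surface nodes, the resulting distance is in general longer than the straight-line distance between the two C-termini, which is the motivation for using a graph rather than a direct Euclidean measure.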
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that are within
the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3]
counted as a single PPI).
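The distinction between all pairs and unique pairs can be made concrete with a small sketch: a reciprocal pair (X as bait with Y as prey, and Y as bait with X as prey) is collapsed into one unordered pair. The function name and the example pairs are hypothetical; the protein names are taken from the text only for illustration.

```python
def unique_pairs(interactions):
    """Collapse (bait, prey) tuples into unordered protein pairs."""
    return {frozenset(pair) for pair in interactions}


# Invented list of detected (bait, prey) interactions.
detected_interactions = [
    ("Rpb2", "Rpb3"),
    ("Rpb3", "Rpb2"),   # reciprocal of the first entry
    ("Rpb5", "Rpo26"),
]
n_pairs = len(detected_interactions)
n_unique = len(unique_pairs(detected_interactions))
```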
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linkers enhance the ability to detect interactions,
especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institute of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
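The thresholding described in (B) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes that the z-score is computed against the mean and sample standard deviation of a background distribution of colony sizes, and the function names and colony sizes are hypothetical.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Return the z-score of each colony size relative to a background sample."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation (assumption)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background, threshold=2.5):
    """Flag a PPI as positive when its z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes (arbitrary units), for illustration only.
background = [100, 110, 90, 105, 95]
observed = [102, 180, 97]
calls = call_positive(observed, background)
```

With these made-up values, only the second colony lies far enough above the background to be called positive.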
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
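As a rough sketch of the idea in (C) and (D), a distance-weighted shortest path can be computed with Dijkstra's algorithm on a graph whose nodes are surface residue coordinates and whose edges are weighted by Euclidean distance. The neighbour cutoff, the coordinates and the function name are assumptions made for illustration; the caption does not specify how the authors built the graph.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    Two points are connected when they lie within `cutoff` of each other
    (a hypothetical neighbour criterion); edge weights are Euclidean distances.
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return d
        for v in range(len(points)):
            if v == u or v in visited:
                continue
            step = math.dist(points[u], points[v])
            if step <= cutoff and d + step < best.get(v, math.inf):
                best[v] = d + step
                heapq.heappush(heap, (d + step, v))
    return math.inf


# Hypothetical coordinates (in angstroms): two C-termini (indices 0 and 2)
# and one intermediate surface residue (index 1).
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
path_length = shortest_surface_path(points, start=0, end=2, cutoff=5.5)
straight_line = math.dist(points[0], points[2])
```

In this toy case the two termini are 6 units apart in a straight line, but the path constrained to pass through the intermediate surface point is longer, which is the reason for using a path along the surface rather than a direct distance.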
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
to each other in space, without these associations necessarily being physical
interactions. Such a method would make it possible to explore and better dissect the architecture
of protein complexes. In practice, this meant modifying the length of the
linkers used in the DHFR PCA in S. cerevisiae. To validate the method, we first had to
verify whether increasing the linker length changed the interactions
detected. It was also relevant to test the application of the method to the study of
protein complexes using several combinations of linkers of different
lengths. Finally, the validity of the method could be further confirmed by
comparing the results obtained with the distances measured from the available
protein structures of the proteasome.
The results of the first validation show that by adjusting a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new
associations were observed within protein complexes and between different
complexes, notably between the proteasome and the actin cytoskeleton. The nature of the
associations detected suggests that the specificity of the DHFR PCA is preserved despite the
change in linker length. The in-depth study of the five protein
complexes shows that this variation of the DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the
unique interactions detected, more than 30% were new. One could therefore expect to
obtain nearly as many new interactions if this variation of the PCA were
applied to protein complexes that have already been studied. This percentage could vary with the
number of combinations of linkers of different lengths used. For example, this
number could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination
of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to
be sufficient to detect most of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased
are more likely caused by steric effects than by destabilization of the
proteins involved. However, these cases can still provide structural
information, notably by identifying the strongest associations within the complex.
In addition, the use of extended linkers provides information on the organization of
protein complexes, particularly when central proteins are involved. Finally, the
associations detected faithfully reflect the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the length of the
linker does allow the detection of associations between proteins that are further apart
in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. Simply doubling the length of the linker of the
DHFR F[1,2] fragment increases the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the
standard linker in addition to the longer linkers, which would
provide validation and a point of comparison and would help detect problems that may have occurred
during construction of the fusion proteins. For example, it becomes easier to spot
faulty recombination or the appearance of mutations: interactions would be observed
for the correctly constructed protein, whereas the
problematic one would show none. Admittedly, adding this control
makes the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA provides an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet it can also be applied at
large scale using a robotic platform. Moreover, the DHFR PCA is an
in vivo method that retains the endogenous promoter for protein expression. The
fragments do not tend to associate spontaneously unless they are
very close together, which reduces false positives. The DHFR PCA can be performed on either
solid or liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be
monitored in real time, which gives access to the study of interaction dynamics (56). These
features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting
to establish a range of linker lengths that would give several resolutions
of the PPI network. One would first need to determine the maximum length that still detects
plausible protein-protein associations while limiting false positives. One would also need to
determine the optimal increment to maximize new information, taking
into account the additional complexity that comes with each added linker. The availability of
robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins
with different linker lengths. Such additional collections
would provide a picture of the yeast protein-protein association network at different
resolutions, from fine to coarse. Indeed, the longer the linker,
the more distant the associations detected, which lowers the
molecular resolution. Before investigating a protein complex more exhaustively, one should
take into account its characteristics, such as its size and flexibility. For
small protein complexes, a finer resolution, and therefore shorter linkers,
may be sufficient, whereas a coarser resolution would be needed for
large protein complexes.
The method developed during this master's project becomes particularly interesting
for the study of macromolecular protein complexes. These are complexes whose
composition is not fully known but which are visible by electron
microscopy or other imaging methods. The size of these complexes greatly
limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular
stress, and they are linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution made possible by
lengthening the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Measuring proximate protein association in living cells using
Protein-fragment complementation assay (PCA)
Summary
Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool to detect and measure protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that lie further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex is dependent on stable association among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories of methods are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted from a gel. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from a gel. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (respectively for resistance to clonNAT and HygB), were amplified by PCR from their respective plasmid with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins, members of seven complexes, fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or because they interacted with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated proteins, members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
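The block design described above can be sketched in a few lines. This is only an illustration: the protein names are placeholders, the strain-name format is invented for readability, and reading each label as bait linker first, prey linker second is an assumption.

```python
from itertools import product

LINKERS = ("2xL", "4xL")  # the two linker lengths used in the intra-complexes experiment


def linker_block(bait, prey, linkers=LINKERS):
    """Return the four (label, bait strain, prey strain) entries of one block.

    The strain-name format is hypothetical; baits carry DHFR F[3] and preys
    carry DHFR F[1,2], as stated in the text for this experiment.
    """
    block = []
    for bait_linker, prey_linker in product(linkers, repeat=2):
        label = f"{bait_linker}-{prey_linker}"
        block.append((label,
                      f"{bait}-{bait_linker}-F[3]",
                      f"{prey}-{prey_linker}-F[1,2]"))
    return block


if __name__ == "__main__":
    for entry in linker_block("ProteinX", "ProteinY"):
        print(entry)
```

Keeping the four combinations side by side in one block is what later allows the interaction scores of the different linker combinations to be compared directly.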
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal between linkers was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, retaining the particle closest to the center. We considered the particle as being a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
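As a minimal sketch of this transformation step, assuming each colony is available as a pair of numbers (intensity on the second MTX plate, size on the diploid plate), the filter and the log2(x + 1) transform could be written as follows. The function names and the data layout are illustrative, not those of the original scripts.

```python
import math

MIN_DIPLOID_SIZE = 16  # threshold stated in the text (unit as measured on the plates)


def log2_intensity(intensity):
    """log2(x + 1): a colony with zero intensity maps to 0 instead of being undefined."""
    return math.log2(intensity + 1)


def keep_colony(diploid_size, min_size=MIN_DIPLOID_SIZE):
    """Colonies smaller than the threshold on the diploid selection plate are discarded."""
    return diploid_size >= min_size


def transform_plate(colonies):
    """colonies: iterable of (mtx_intensity, diploid_size) pairs.

    Returns the log2-transformed intensities of the colonies that pass the size filter.
    """
    return [log2_intensity(intensity)
            for intensity, diploid_size in colonies
            if keep_colony(diploid_size)]


if __name__ == "__main__":
    print(transform_plate([(0, 20), (3, 16), (7, 10)]))
```

Filtering on the diploid plate rather than on the MTX plate removes positions where the cross itself failed, so that a missing colony is not mistaken for an absence of interaction.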
For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
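The scoring scheme above can be illustrated with a small, self-contained sketch. The study used the R function normalmixEM from mixtools. The code below is a plain-Python stand-in written for illustration only: it implements a basic two-component EM with a midpoint initialisation chosen for reproducibility, and it assumes that the component with the lower mean is the background. All function names are hypothetical.

```python
import math
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component normal mixture by EM.

    Returns two (mean, sd, weight) tuples sorted by mean.
    """
    lo, hi = min(scores), max(scores)
    if lo == hi:
        raise ValueError("all scores are identical; no mixture can be fitted")
    mid = (lo + hi) / 2.0
    groups = [[x for x in scores if x <= mid], [x for x in scores if x > mid]]
    mu = [statistics.mean(g) for g in groups]
    sd = [statistics.pstdev(g) or 1.0 for g in groups]
    w = [len(g) / len(scores) for g in groups]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score
        resp = []
        for x in scores:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in (0, 1)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: update weight, mean and standard deviation of each component
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            w[k] = nk / len(scores)
            mu[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, scores)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(mu, sd, w))


def background_parameters(scores):
    """Mean (b) and standard deviation (sdb) of the lower-mean component."""
    (b, sdb, _weight), _signal = fit_two_normals(scores)
    return b, sdb


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, as defined in the text."""
    return (interaction_score - b) / sdb


def is_detected(interaction_score, b, sdb, threshold=2.5):
    """An interaction is called detected when its z-score exceeds the threshold."""
    return z_score(interaction_score, b, sdb) > threshold
```

With a large background population and a smaller population of true interactions, the lower component captures the growth of non-interacting pairs, and the z-score expresses how far a given pair stands above that background.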
For the intra-complexes experiment, extreme outliers on the MTX selection plates, that is, values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
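The outlier rule and the replicate requirement can be sketched as follows. Two points are assumptions made for the example: the fences are computed on the set of replicate values passed to the function (the text does not say whether they were computed per interaction or per plate), and the quartiles use the "inclusive" convention of Python's statistics module, which may differ slightly from the convention used in the original analysis.

```python
import statistics


def extreme_outlier_bounds(values, k=3.0):
    """Return (Q1 - k*IQR, Q3 + k*IQR); k = 3 gives the 'extreme outlier' fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_extreme_outliers(values, k=3.0):
    """Drop values lying outside the extreme outlier fences."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median colony size of the replicates that survive filtering.

    Returns None when fewer than min_replicates values remain.
    """
    kept = remove_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)


if __name__ == "__main__":
    print(interaction_score([10, 11, 12, 13, 14, 15, 16, 100]))
```

Using the median as the interaction score adds a second layer of robustness, since a single aberrant replicate that escapes the fences has little influence on it.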
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
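The three categories can be expressed as a simple decision rule. In the sketch below, the text's "FDR-corrected" p-values are assumed to follow the Benjamini-Hochberg procedure, which the source does not name. The paired t-test itself is not reproduced (its p-values could come, for example, from scipy.stats.ttest_rel), and any detected pair that does not qualify as a quantitative change is treated as unchanged, which is a simplification.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, differences, fdr_p_values,
                  alpha=0.05, min_difference=1.0):
    """Classify one protein pair.

    detected_ref:    interaction detected with 2xL-2xL
    detected_longer: interaction detected with at least one longer combination
    differences:     Is differences versus 2xL-2xL, one per longer combination
    fdr_p_values:    matching FDR-corrected p-values
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_difference and p < alpha
                  for d, p in zip(differences, fdr_p_values))
    return "quantitative change" if changed else "unchanged"


if __name__ == "__main__":
    print(classify_pair(True, True, [1.5, 0.2, 0.1], [0.01, 0.5, 0.6]))
```

Requiring both an effect size (difference beyond 1 in either direction) and a corrected p-value below 0.05 prevents small but statistically significant differences from being reported as changes.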
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
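The surface-path distance can be illustrated with a toy example. The analysis itself relied on NetworkX, while the sketch below uses only the Python standard library (heapq) as a stand-in so that it is self-contained. The node names and coordinates are invented, and the two functions are illustrative rather than the original code.

```python
import heapq
import math
from itertools import combinations


def build_surface_graph(coords, cutoff):
    """coords: dict mapping a node name to the (x, y, z) position of a surface Cα atom.

    Two nodes are connected when they lie within `cutoff` of each other,
    and the edge weight is their Euclidean distance.
    """
    graph = {node: [] for node in coords}
    for a, b in combinations(coords, 2):
        d = math.dist(coords[a], coords[b])
        if d <= cutoff:
            graph[a].append((b, d))
            graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf


if __name__ == "__main__":
    toy = {"A": (0.0, 0.0, 0.0), "B": (10.0, 0.0, 0.0), "C": (10.0, 10.0, 0.0)}
    print(shortest_path_length(build_surface_graph(toy, cutoff=12.0), "A", "C"))
```

In the toy example, A and C are about 14.1 apart in a straight line, but with a cutoff of 12 the only route runs through B, so the path length is 20. This is the point of restricting the graph to surface residues: the measured distance follows the surface of the complex instead of cutting through it, which is closer to what a flexible linker would have to span.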
All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
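The size of this screen follows directly from the numbers given in the text, as the short calculation below shows. The variable and function names are illustrative. The relative gain is a simple ratio of detected counts and is not a statistic reported by the authors.

```python
BAIT_PROTEINS = 15
BAIT_LINKERS = ("2xL", "3xL", "4xL")
PREYS_SELECTED = 495   # same complex as a bait, or reported interactor in BioGRID
PREYS_RANDOM = 97      # random cytoplasmic or nuclear proteins used as controls
DETECTED = {"2xL": 99, "3xL": 110, "4xL": 126}  # PPIs with z-score above 2.5


def n_bait_strains():
    """One bait strain per protein and per linker length."""
    return BAIT_PROTEINS * len(BAIT_LINKERS)


def n_preys():
    """Selected preys plus random control preys."""
    return PREYS_SELECTED + PREYS_RANDOM


def n_potential_interactions():
    """Every bait strain is crossed with every prey strain."""
    return n_bait_strains() * n_preys()


def relative_gain(linker, reference="2xL"):
    """Fractional increase in detected PPIs relative to the reference linker."""
    return (DETECTED[linker] - DETECTED[reference]) / DETECTED[reference]


if __name__ == "__main__":
    print(n_bait_strains(), n_preys(), n_potential_interactions())
    print(round(relative_gain("3xL"), 3), round(relative_gain("4xL"), 3))
```

The counts reproduce the 45 baits, 592 preys and 26,640 potential interactions quoted above, and the detected numbers correspond to roughly 11% more PPIs with the 3xL and 27% more with the 4xL than with the 2xL.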
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as only one PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, supporting the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
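The counts in this paragraph partition the interacting pairs into three categories, which can be checked with a few lines of arithmetic. The dictionary layout and function names below are only a convenient way to organise the numbers quoted in the text.

```python
COUNTS = {
    "all":    {"unchanged": 135, "quantitative": 19, "new": 74, "total": 228},
    "unique": {"unchanged": 114, "quantitative": 18, "new": 65, "total": 197},
}


def affected(counts):
    """Pairs for which a longer linker changed the outcome (quantitative change or new PPI)."""
    return counts["quantitative"] + counts["new"]


def unchanged_fraction(counts):
    """Share of interacting pairs showing no significant change with the longer linker."""
    return counts["unchanged"] / counts["total"]


def is_consistent(counts):
    """The three categories should add up to the total number of pairs."""
    return counts["unchanged"] + affected(counts) == counts["total"]


if __name__ == "__main__":
    for name, counts in COUNTS.items():
        print(name, affected(counts), round(unchanged_fraction(counts), 3),
              is_consistent(counts))
```

The unchanged fractions come out at about 59% (all pairs) and 58% (unique pairs), consistent with the "almost 60%" stated above, and the affected pairs add up to 93 and 83, respectively.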
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion of new interactions (about 30-40%) was detected for the RNApol
complexes than for the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to an
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). Together, these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed
better discrimination between the different subunits of large complexes. This is particularly
well illustrated by the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A).
We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible in the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are more readily detected with longer linkers
(Fig. S1D). Finally, we find that the distance between proteins is significantly longer in cases
where a longer linker increases the signal or leads to the detection of new interactions
(Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect
interactions, especially between proteins that are more distant in space.
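The negative relationship between pairwise distance and z-score is reported as a Spearman correlation, which is a Pearson correlation computed on ranks. The following is a minimal pure-Python sketch of that statistic, using made-up distances and z-scores. The function names are illustrative; a real analysis would more likely use a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: z-scores tend to fall as C-terminus distance grows.
distances = [10, 20, 30, 40]
z_scores = [8.0, 5.0, 6.0, 1.0]
print(spearman(distances, z_scores))
```

Because the statistic depends only on ranks, it captures a monotonic trend without assuming that z-scores decrease linearly with distance.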
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection could
be used as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information on subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants
299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1 Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2 Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance
between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to protein pairwise distance.
(C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes
according to the distance between the C-termini: interactions that are not affected by longer
linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon
tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1 Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome:
top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected
results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by
DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment.
Examples for each category of change are shown. Cell growth in the spot-dilution assay (right)
correlates with colony size in standard PCA (left).
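Panel (B) relates a colony-size threshold to a z-score cutoff. The following Python sketch shows how such a call could be made, assuming the background is summarized by the mean and standard deviation of a set of negative-control colony sizes. That assumption, the function names and the toy values are illustrative only; the cutoff is a parameter, set here to 2.5.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background noise."""
    mu, sigma = mean(background), stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positives(colony_sizes, background, threshold=2.5):
    """Flag colonies whose z-score exceeds the chosen threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Toy values in arbitrary units (hypothetical, not data from the study).
background = [9, 10, 11, 10, 10]
colonies = [10, 11, 13]
print(call_positives(colonies, background))
```

Expressing the cutoff in z-score units rather than raw colony size makes the same threshold comparable across experiments with different background growth.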
Figure S2 Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
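One plausible reading of a distance-weighted shortest path is Dijkstra's algorithm on a graph whose nodes are surface residues and whose edges carry Euclidean distances. The Python sketch below follows that reading. The neighbor rule (`max_step`), the function name and the toy coordinates are assumptions for illustration and are not the procedure used to produce Table S2A.

```python
import heapq
from math import dist, inf


def shortest_path_length(points, start, goal, max_step):
    """Length of the shortest path from start to goal over a neighbor graph.

    points: list of (x, y, z) coordinates, e.g. surface residues (assumed input).
    Two points are linked when they are at most max_step apart, and the edge
    weight is their Euclidean distance.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            return d
        if d > best[u]:
            continue  # stale queue entry, a shorter route was already found
        for v, p in enumerate(points):
            if v == u:
                continue
            step = dist(points[u], p)
            if step <= max_step and d + step < best.get(v, inf):
                best[v] = d + step
                heapq.heappush(heap, (d + step, v))
    return inf  # goal cannot be reached with this step limit


# Toy coordinates (hypothetical, not taken from any structure).
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(shortest_path_length(points, 0, 2, max_step=4.5))
```

Restricting steps to nearby surface points makes the path run along the outside of the complex instead of cutting straight through it, which is arguably closer to the route a flexible linker would have to take.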
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are in
close proximity in space, without these necessarily being physical interactions. Such a
method would make it possible to examine and dissect the architecture of protein complexes
in greater depth. Concretely, the approach consisted of modifying the length of the
linkers in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary
to verify whether increasing linker length changed the interactions detected. It was also
relevant to test the application of the method to the study of protein complexes using
several combinations of linkers of different lengths. Finally, the validity of the method
could be further confirmed by comparing the results obtained with distances measured from
the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely
doubling the length of one linker, significantly increased the signal-to-noise ratio,
allowing better identification of associations. Seven new associations were observed
within protein complexes and between different complexes, notably between the proteasome
and the actin cytoskeleton. The nature of the detected associations suggests that the
specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth
study of the five protein complexes shows that this variation of the DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain
nearly as many new interactions if this variation of the PCA were applied to protein
complexes that have already been studied. This percentage could vary depending on the
number of linker-length combinations used. For example, it could be lower if only a
single combination were used, since some protein-protein associations were detectable
only with one specific combination of linkers. Using an elongated linker on the DHFR
F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those
whose signal increases.
The rare cases where the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases
can nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of elongated linkers is informative
about the organization of protein complexes, particularly when it involves central
proteins. Finally, the detected associations closely reflect the organization of protein
complexes into subcomplexes. By comparing the distances between proteins in the
proteasome structures with the PCA results obtained, it is possible to confirm that
increasing linker length does allow the detection of associations between proteins that
are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR
F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker
in addition to linkers of other lengths, which would provide a validation and a point of
comparison and would help detect problems arising during the construction of the
proteins. For example, it is easier to spot a faulty recombination or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas
the problematic one would show none. However, adding this control certainly makes the
experiments and analyses more complex. Despite this drawback, this variation of the DHFR
PCA provides an additional hybrid method that remains relatively simple. It requires no
special infrastructure, yet can also be applied at large scale using a robotic platform.
Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for
protein expression. The fragments do not tend to interact spontaneously with each other
unless they are very close together, which reduces false positives. The DHFR PCA can be
performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
monitored in real time, which gives access to the study of interaction dynamics (56).
These features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to
establish a range of linker lengths providing several resolutions of the PPI network. It
would first be necessary to determine the maximum length that allows plausible
protein-protein associations to be detected while limiting false positives. It would also
be necessary to determine the optimal increment to maximize new information while taking
into account the additional complexity introduced by each added linker. The availability
of robotic platforms makes it more realistic to create collections of DHFR F[1,2]
proteins with different linker lengths. Such additional collections would provide a
picture of the yeast protein-protein association network at different resolutions, from
fine to coarse. Indeed, the longer the linker, the more distant the detected
associations, which lowers the molecular resolution. Before investigating a protein
complex more exhaustively, its characteristics, such as its size and flexibility, should
be taken into account. For small protein complexes, a finer resolution, and therefore
shorter linkers, may be sufficient, whereas the resolution should be coarser for large
protein complexes.
The method developed during this master's project becomes particularly interesting for
the study of macromolecular protein complexes. These are complexes whose composition is
not fully known but which are visible by electron microscopy or with other imaging
methods. The size of these complexes greatly limits their study and makes determining
their architecture a challenge. Processing bodies and stress granules are one example.
They are involved, respectively, in the degradation and the storage of messenger RNA
during cellular stress, and they are notably linked to various diseases such as cancer
and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by
lengthening the linker would give us a general picture of their architecture. For the
proteome of an organism, this method would provide a better definition of the
organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Abstract
Understanding the function of cellular systems requires cataloguing how proteins assemble
with each other into complexes and determining their spatial relationships. Here, we examine
the potential of the yeast Protein-fragment Complementation Assay based on the
dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein
complexes. We show that the use of longer peptide linkers between the fusion proteins and
the DHFR fragments significantly improves the detection of protein-protein interactions and
reveals interactions between proteins that are further apart in space. Longer linkers thus
provide an enhanced tool for the detection and measurement of protein-protein interactions
and protein proximity in living cells. We use this tool to further investigate the architecture
of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes
in yeast. Our results open new avenues for the dissection of protein networks in living cells.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex is dependent on stable
association among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. The first large-scale study performed using DHFR PCA in yeast showed that
the distance constraint determined by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 35 times more likely to be detected if the C-termini were less than
82 Å from each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter allows the detection of configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here, we test the effect of linker size on the ability to detect
PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
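The concentrations above translate directly into amounts to weigh out. The following is a minimal sketch of that arithmetic for the MTX medium, assuming a 1 L batch (the batch volume is an illustrative assumption, not a value from the text). The drop-out mix is left out because no concentration is given for it.

```python
# Convert the % (w/v) and µg/mL concentrations quoted for MTX medium
# into amounts to weigh. The 1 L batch volume used below is an assumption.

MTX_MEDIUM_PERCENT_WV = {
    "yeast nitrogen base": 0.67,
    "glucose": 2.0,
    "noble agar": 2.5,
}
MTX_UG_PER_ML = 200.0


def grams_needed(percent_wv, volume_ml):
    """% (w/v) means grams of solute per 100 mL of medium."""
    return percent_wv * volume_ml / 100.0


def milligrams_needed(ug_per_ml, volume_ml):
    """µg/mL times mL gives µg; divide by 1000 to get mg."""
    return ug_per_ml * volume_ml / 1000.0


if __name__ == "__main__":
    volume_ml = 1000.0  # hypothetical batch size
    for name, pct in MTX_MEDIUM_PERCENT_WV.items():
        print(f"{name}: {grams_needed(pct, volume_ml):.1f} g")
    print(f"methotrexate: {milligrams_needed(MTX_UG_PER_ML, volume_ml):.0f} mg")
```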
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
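As an illustration of the linker design, the sketch below builds the 2xL, 3xL and 4xL peptide sequences from the Gly-Gly-Gly-Gly-Ser motif and reports their lengths. The upper bound on linker reach is only a rough guide: it assumes about 3.8 Å per residue for a fully extended chain, a figure that is not taken from the text, and it ignores linker flexibility and the dimensions of the DHFR fragments and of the fused proteins.

```python
# Build the (GGGGS)n linker peptides described above.
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser in one-letter code

# Assumption (not from the text): roughly 3.8 Å per residue when fully extended.
ASSUMED_ANGSTROM_PER_RESIDUE = 3.8


def linker_sequence(repeats):
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_extended_length(repeats, per_residue=ASSUMED_ANGSTROM_PER_RESIDUE):
    """Crude upper bound (in Å) on the end-to-end length of one linker."""
    return len(linker_sequence(repeats)) * per_residue


if __name__ == "__main__":
    for name, repeats in (("2xL", 2), ("3xL", 3), ("4xL", 4)):
        seq = linker_sequence(repeats)
        print(f"{name}: {seq} ({len(seq)} aa, "
              f"at most ~{max_extended_length(repeats):.0f} Å extended)")
```

A flexible Gly/Ser linker is rarely fully extended, so these values are ceilings under the stated assumption, not expected distances.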
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from the gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm
plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
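The insert:vector ratios quoted for the Gibson assemblies are molar ratios, so the mass of each fragment to add depends on its length. Here is a minimal sketch of that conversion. The fragment sizes and the vector mass in the example are hypothetical placeholders, since the text does not give them.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    For double-stranded DNA, mass scales with length, so the number of
    molecules is proportional to ng / bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    vector_ng, vector_bp = 100.0, 5000
    linker_dhfr_bp = 600   # hypothetical insert length
    # 5:1 insert:vector assembly
    print(insert_mass_ng(vector_ng, vector_bp, linker_dhfr_bp, 5))
    # 4:4:1 assembly with two inserts of hypothetical lengths
    for name, bp in (("linker", 100), ("DHFR F[3]", 400)):
        print(name, insert_mass_ng(vector_ng, vector_bp, bp, 4))
```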
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were
confirmed for all strains by PCR and Sanger sequencing.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies directed against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
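Because actin is probed as a loading control, one common way to compare the abundance of a 2xL and a 4xL fusion is to divide each band intensity by the actin intensity of the same lane before taking the ratio. The sketch below illustrates that calculation. The normalization scheme and the intensity values are assumptions for illustration, as the text does not describe how the quantification was carried out beyond the software used.

```python
def normalized_signal(target_intensity, actin_intensity):
    """Band intensity of the protein of interest relative to the actin band."""
    if actin_intensity <= 0:
        raise ValueError("loading-control intensity must be positive")
    return target_intensity / actin_intensity


def fold_change_4x_vs_2x(target_2x, actin_2x, target_4x, actin_4x):
    """Abundance of the 4xL fusion relative to the 2xL fusion after normalization."""
    return (normalized_signal(target_4x, actin_4x)
            / normalized_signal(target_2x, actin_2x))


if __name__ == "__main__":
    # Hypothetical band intensities (arbitrary units).
    print(fold_change_4x_vs_2x(target_2x=1000.0, actin_2x=500.0,
                               target_4x=1800.0, actin_4x=600.0))
```

A ratio close to 1 would indicate that changing the linker did not noticeably alter the abundance of the fusion protein.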
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same
complexes as the baits or interact with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by Benschop et al. (84) to choose 95 central and
associated protein members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned such that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
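The array design described above can be sketched in a few lines: enumerate the four linker combinations, reserve a two-position border on a 32 x 48 (1536-position) array, and assign replicate blocks to random interior locations. The 2 x 2 geometry of a block, the seed and the function names are assumptions made for illustration, and the text does not state which member of each pair label is the bait. The actual layout procedure used in the study is not described at this level of detail.

```python
import random
from itertools import product

LINKERS = ("2xL", "4xL")
ROWS, COLS = 32, 48   # 32 x 48 = 1536 positions
BORDER = 2            # double border reserved for the weak-interaction control
BLOCK = 2             # assumption: a block of four strains is a 2 x 2 square


def linker_combinations():
    """All linker pairings tested for one interaction."""
    return [f"{first}-{second}" for first, second in product(LINKERS, repeat=2)]


def interior_positions(rows=ROWS, cols=COLS, border=BORDER):
    """Array positions left once the border rows and columns are excluded."""
    return [(r, c)
            for r in range(border, rows - border)
            for c in range(border, cols - border)]


def block_origins(rows=ROWS, cols=COLS, border=BORDER, block=BLOCK):
    """Top-left corner of every non-overlapping block inside the border."""
    return [(r, c)
            for r in range(border, rows - border, block)
            for c in range(border, cols - border, block)]


def randomized_layout(entries, seed=0):
    """Assign each (interaction, replicate) entry to a randomly chosen block."""
    origins = block_origins()
    if len(entries) > len(origins):
        raise ValueError("more blocks requested than the array can hold")
    rng = random.Random(seed)
    rng.shuffle(origins)
    return dict(zip(entries, origins))


if __name__ == "__main__":
    print(linker_combinations())
    print(len(interior_positions()), "interior positions,",
          len(block_origins()), "blocks of four")
```

Under these assumptions, one array offers 1232 interior positions, that is 308 blocks of four.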
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area smaller than 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
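As an illustration of the two-way median polish mentioned above, here is a minimal sketch in plain Python. It is not the custom ImageJ script used for this work: the function name `median_polish`, the fixed number of sweeps and the toy matrix are assumptions made for the example. The text does not say which component of the decomposition was averaged with the raw image, so the sketch simply returns all of them.

```python
from statistics import median

def median_polish(matrix, n_iter=10):
    """Tukey's two-way median polish (illustrative sketch).

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the row medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        # Move the median of the column effects into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the column medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Move the median of the row effects into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid

# Toy intensity matrix (hypothetical values): a purely additive row + column trend.
toy = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
overall, row_eff, col_eff, resid = median_polish(toy)
```

For a purely additive matrix such as `toy`, the residuals vanish and the row and column effects capture the whole trend. On a real plate image, the residuals would carry the local colony signal.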
Colony intensity values from day 4 of growth of the second MTX selection were log2-
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
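The conversion from interaction scores to z-scores can be sketched as follows. This is a hedged illustration in plain Python, not the R mixtools call used for the analysis: the EM routine `fit_two_normals`, its initialization, the simulated scores and the choice of the lower-mean component as the background are all assumptions made for the example.

```python
import math
import random

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_normals(scores, n_iter=200):
    """EM fit of a two-component normal mixture.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean.
    """
    s = sorted(scores)
    half = len(s) // 2
    comps = []
    # Crude initialization (assumption): lower and upper halves of the sorted data.
    for part in (s[:half], s[half:]):
        mu = sum(part) / len(part)
        sd = math.sqrt(sum((v - mu) ** 2 for v in part) / len(part))
        comps.append([0.5, mu, max(sd, 1e-3)])
    for _ in range(n_iter):
        # E-step: posterior probability that each score belongs to component 0.
        resp = []
        for x in scores:
            p0 = comps[0][0] * normal_pdf(x, comps[0][1], comps[0][2])
            p1 = comps[1][0] * normal_pdf(x, comps[1][1], comps[1][2])
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: update weight, mean and standard deviation of each component.
        for k, r in enumerate((resp, [1.0 - v for v in resp])):
            n_k = sum(r)
            if n_k < 1e-9:
                continue
            mu = sum(w * x for w, x in zip(r, scores)) / n_k
            var = sum(w * (x - mu) ** 2 for w, x in zip(r, scores)) / n_k
            comps[k] = [n_k / len(scores), mu, max(math.sqrt(var), 1e-3)]
    return sorted((tuple(c) for c in comps), key=lambda c: c[1])

def z_scores(scores):
    """Zs = (Is - b) / sdb, taking the lower-mean component as background (assumption)."""
    _, b, sdb = fit_two_normals(scores)[0]
    return [(x - b) / sdb for x in scores]

# Simulated interaction scores (hypothetical): mostly background plus a few interactions.
random.seed(0)
demo_scores = ([random.gauss(5.0, 0.5) for _ in range(300)]
               + [random.gauss(10.0, 0.5) for _ in range(30)])
demo_z = z_scores(demo_scores)
detected = [z > 2.5 for z in demo_z]  # threshold used in the text
```

EM can converge to a local optimum, so in practice the fit would be checked visually against the score histogram, as done in Fig. S1B.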
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
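The outlier rule and the replicate threshold described above can be sketched in a few lines of Python. The function names and the example values are hypothetical. Applying the fences to one set of replicates at a time, and the quartile interpolation method, are assumptions, since the text does not specify either.

```python
from statistics import median, quantiles

def filter_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def interaction_score(replicates, min_replicates=4):
    """Median colony size after outlier removal, or None if too few replicates remain."""
    kept = filter_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return median(kept)

# Hypothetical log2 colony sizes for one bait-prey pair; 100 is an extreme outlier.
sizes = [10, 11, 12, 11, 10, 12, 11, 100]
score = interaction_score(sizes)
```

Using a factor of 3 rather than the more common 1.5 removes only extreme values, which keeps most replicates in the median.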
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
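The classification rules above can be summarized with the following Python sketch. The text only states that p-values were FDR-corrected, so the Benjamini-Hochberg procedure used here is an assumption, as are the function names and the example numbers. Pairs that are detected with the reference linker but do not meet both criteria for a quantitative change are treated as unchanged, which is also an assumption where the text is silent.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_longer, changes, min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref: detected with 2xL-2xL.
    detected_longer: detected with at least one longer linker combination.
    changes: list of (difference vs 2xL-2xL, FDR-adjusted p-value), one per
             longer linker combination (2xL-4xL, 4xL-2xL, 4xL-4xL).
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p_adj in changes:
        if abs(diff) > min_diff and p_adj < alpha:
            return "quantitative change"
    return "unchanged"

# Hypothetical raw p-values from paired t-tests for one pair.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
label = classify_pair(True, True, [(1.4, adj_p[0]), (0.2, adj_p[1]), (-0.3, adj_p[2])])
```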
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB protein data bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB protein data bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB protein data bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were
used as nodes to build the graph. Edges were placed between each pair of nodes lying within
a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the two nodes. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
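The surface-path distance can be illustrated with a small, self-contained Python sketch. The analysis itself relied on NetworkX; the plain-Python Dijkstra below is a stand-in with the same logic, and the function names, node labels and coordinates are hypothetical.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = Euclidean distance.

    coords: dict mapping a node label to its (x, y, z) Cα coordinates in Å.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in graph[node]:
            cand = dist + weight
            if cand < best.get(nxt, math.inf):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt))
    return math.inf

# Hypothetical surface Cα positions: the two C-termini are 20 Å apart in a straight
# line, but with a 15 Å cutoff the path has to go through an intermediate surface node.
surface = {
    "ProtA_Cter": (0.0, 0.0, 0.0),
    "surface_1": (10.0, 8.0, 0.0),
    "ProtB_Cter": (20.0, 0.0, 0.0),
    "isolated": (100.0, 0.0, 0.0),
}
g = build_graph(surface, cutoff=15.0)
d_ab = shortest_path_length(g, "ProtA_Cter", "ProtB_Cter")
```

Because edges only connect nearby surface nodes, the resulting distance follows the surface of the complex instead of cutting through its interior, and it is therefore longer than the straight-line distance.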
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one
PPI) after filtering.
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
fact that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from the lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are better detected with longer linker sizes (Fig. S1D).
Finally, we find that the distance among proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 931
Constructed proteasome chain 58 Rpn2 Proteasome 909
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 829
Constructed proteasome chain 66 Rpn11 Proteasome 895
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 859
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples
for each category of change are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
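As an illustration of panels (B) and (C), the following is a minimal Python sketch, not the
analysis code used for this work. It assumes that z-scores are computed against the mean and
standard deviation of all colony sizes in an experiment, reads the threshold in (B) as 2.5,
and uses Pearson's coefficient for the reciprocal comparison in (C). The function names are
hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Z-score of each colony size relative to the whole distribution (assumed background)."""
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(pairs, colony_sizes, threshold=2.5):
    """Return the bait-prey pairs whose z-score exceeds the threshold (assumed to be 2.5)."""
    scores = z_scores(colony_sizes)
    return [pair for pair, z in zip(pairs, scores) if z > threshold]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists, e.g. A-B versus B-A signals."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Calling positive_ppis on the list of tested pairs and their colony sizes returns the pairs
scored as positive, and pearson_r applied to the signals of each pair in both orientations
gives a coefficient comparable to those quoted in (C).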
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
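The distance-weighted shortest path of panel (C) can be sketched in Python as follows. This
is an illustrative sketch under stated assumptions, not the procedure used to produce the
figure: surface residues are represented by 3D coordinates, two residues are connected when
they lie within a cutoff distance (the cutoff value and the function names are hypothetical),
each edge is weighted by the Euclidean distance, and the path is found with Dijkstra's
algorithm.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Connect surface points closer than `cutoff`; edge weight is the Euclidean distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf
```

With the two C-terminal residues as source and target, the returned length follows the
surface of the complex rather than cutting through it, which is the reason for restricting
the graph to surface residues.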
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close to each other in space without these associations necessarily being physical
interactions. Such a method would make it possible to explore and dissect the architecture
of protein complexes in greater depth. In practice, this meant modifying the length of the
linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary
to verify whether increasing the linker length changed the interactions that were
detected. It was also relevant to test the method for the study of protein complexes
using several combinations of linkers of different lengths. Finally, the validity of the
method could be further confirmed by comparing the results with distances measured
from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely
doubling the length of one linker, significantly increased the signal-to-noise ratio,
which improved the identification of associations. Seven new associations were
observed within protein complexes and between different complexes, notably between
the proteasome and the actin cytoskeleton. The nature of the detected associations
suggests that the specificity of the DHFR PCA is preserved despite the change in
linker length. The in-depth study of the five protein complexes shows that this
variation of the DHFR PCA detects new interactions while preserving the specificity
of the method. Indeed, more than 30% of all the unique interactions detected were
new. One could therefore expect to obtain nearly as many new interactions if this
variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different
lengths that are used. For example, it could be lower if only a single combination were
used, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears
to be sufficient to detect the majority of the new PPIs and of those whose signal
increases.
The rare cases in which the signal decreased as the linker length increased are more
likely caused by steric effects than by a destabilization of the proteins involved.
These cases can nevertheless provide structural information, notably by identifying
the strongest associations within the complex. In addition, the use of extended linkers
provides information on the organization of protein complexes, particularly when it
involves the central proteins. Finally, the detected associations closely reflect the
organization of protein complexes into subcomplexes. Comparing the distances between
proteins in the proteasome structures with the PCA results confirms that increasing the
linker length does allow the detection of associations between proteins that are farther
apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. Doubling only the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the
linkers of other lengths. This would provide a validation and a point of comparison,
and would help detect problems that may have arisen during the construction of the
fusion proteins, such as incorrect recombination or the appearance of mutations.
Interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control does, however, make the
experiments and the analyses more complex. Despite this drawback, this variation of the
DHFR PCA gives access to an additional hybrid method that remains relatively simple.
It does not require any particular infrastructure, yet it can also be applied on a
large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that
keeps the endogenous promoter for protein expression. The fragments do not tend to
associate spontaneously unless they are very close to each other, which reduces false
positives. The DHFR PCA can be performed on either solid or liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of
cellular perturbations. It can also be followed in real time, which gives access to the
study of interaction dynamics (56). These features provide certain advantages over other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to
establish a range of linker lengths that would give several resolutions of the PPI
network. It would first be necessary to determine the maximum length that still detects
plausible protein-protein associations, in order to limit false positives. It would also
be necessary to determine the optimal increment that maximizes new information while
taking into account the additional complexity introduced by each added linker. The
availability of robotic platforms makes it more realistic to create collections of
DHFR F[1,2] proteins with different linker lengths. Such additional collections would
provide a picture of the yeast protein-protein association network at different
resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the
detected associations, which lowers the molecular resolution. Before investigating a
protein complex more exhaustively, its characteristics, such as its size and its
flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas the resolution
should be coarser for large protein complexes.
The method developed during this master's project is particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is
not fully known but which are visible by electron microscopy or by other imaging
methods. The size of these complexes greatly limits their study and makes determining
their architecture a challenge. Processing bodies and stress granules are examples.
They are involved, respectively, in the degradation and the storage of messenger RNA
during cellular stress, and they are linked to various diseases such as cancer and
acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by
extending the linker would give us a general picture of their architecture. At the
level of the proteome of an organism, this method would provide a better definition of
the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Introduction
Protein-protein interactions (PPIs) are central to all cellular functions and are largely
responsible for translating genotypes into phenotypes (1). Investigations into the organization
of PPI networks have revealed important insights into the evolution of cellular functions (30,
31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have
shown how the regulation of protein expression at the transcriptional, translational and
posttranslational levels contributes to the diversity of protein complex assemblies (76-80).
Methods used to investigate the organization of PPIs can be grouped into two main categories
based on whether they infer co-complex memberships or detect physical association (81).
The first category includes methods based on protein purification followed by mass
spectrometry. In this case, protein assignment to a specific complex depends on stable
associations among proteins that survive cell lysis and fractionation or affinity purification
(82, 83). The majority of PPIs that populate interactome databases derive from such methods
because a single purification leads to the inference of many interactions among the
co-purified proteins. Unfortunately, very little is known about the structural and context
dependencies of PPIs inferred from co-complex membership, because detecting an
association does not provide information on the spatial organization of the complex (84-86).
The second category of methods reports binary or pairwise interactions between proteins and
reveals direct or nearly direct interactions. Such methods include the commonly used yeast
two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and
technologies based on similar principles (52). The two categories are potentially complementary
because, on the one hand, they tell us which proteins assemble into complexes in the cell and,
on the other hand, how proteins may be physically located relative to one another (84, 88).
Despite this recent progress, there is still a need for tools that can detect proximate
relationships among proteins in vivo, which would complement and further enhance our
ability to infer the relationships among proteins within and between complexes or
subcomplexes. Being able to infer such relationships at different levels of resolution in living
cells is key to future development in cell and systems biology, because high-resolution
methods such as NMR or X-ray crystallography are not yet amenable to high-throughput
analysis and cannot be applied to all protein types. PCA (87, 89) may provide the
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was
shown that the distance constraint determined by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 35 times more likely to be detected if the C-termini were within less than
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter allows the detection of configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here, we test the effect of linker size on the ability to detect
PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino
acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and
2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from the gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing for all
strains confirmed proper DHFR fragment fusions.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies directed against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same
complexes as the baits or because they interact with one of them according to data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: the RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44
tested) and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing
89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and 4xL
in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
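The block design described above can be illustrated with a minimal sketch that enumerates the four linker combinations tested for one bait-prey pair. The function name and the protein labels are hypothetical and serve only as an illustration; the actual array layout was produced with the robotic tools described below.

```python
from itertools import product

LINKERS = ("2xL", "4xL")  # the two linker sizes used in the intra-complexes experiment

def linker_block(bait, prey):
    """Return the four bait-prey linker combinations of one block (hypothetical helper)."""
    return [f"{bait}-{b}/{prey}-{p}" for b, p in product(LINKERS, repeat=2)]

# Hypothetical protein labels, for illustration only
for combination in linker_block("BaitX", "PreyY"):
    print(combination)
```

The order produced by `product` matches the order used in the text: 2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL.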
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Next, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted, and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files, and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and
a circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
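The particle filter can be sketched as follows. This is an illustration and not the custom ImageJ script used for the screen. It relies on the ImageJ definition of circularity, 4π × area / perimeter², and it assumes that failing either the area or the circularity criterion is enough to exclude a particle, which the text does not state explicitly. Function names are hypothetical.

```python
import math

MIN_AREA = 20          # pixels, threshold quoted in the text
MIN_CIRCULARITY = 0.5  # threshold quoted in the text

def circularity(area, perimeter):
    """4*pi*area / perimeter**2: close to 1 for a circle, close to 0 for elongated shapes."""
    return 4 * math.pi * area / perimeter ** 2

def keep_particle(area, perimeter):
    """Assumption: a particle is kept only if it passes both thresholds."""
    return area >= MIN_AREA and circularity(area, perimeter) >= MIN_CIRCULARITY

# A disc of radius 10 pixels passes; a 1 x 100 pixel streak does not
disc_area, disc_perimeter = math.pi * 10 ** 2, 2 * math.pi * 10
print(keep_particle(disc_area, disc_perimeter))
print(keep_particle(100, 202))
```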
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
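As a minimal sketch of this step, the transformation and the diploid-size filter could be written as below. The data layout (pairs of MTX intensity and diploid colony size) and the function name are assumptions made for the example.

```python
import math

MIN_DIPLOID_SIZE = 16  # threshold quoted in the text, in the units of the size measurement

def transform_colonies(colonies):
    """colonies: list of (mtx_intensity, diploid_size) pairs.

    Returns log2(intensity + 1) for colonies that grew on the diploid selection plate.
    Adding 1 keeps a null intensity defined (it maps to 0).
    """
    return [math.log2(mtx + 1) for mtx, diploid in colonies if diploid >= MIN_DIPLOID_SIZE]

# Hypothetical values: the third colony is removed by the diploid filter
print(transform_colonies([(0, 40), (1023, 35), (500, 10)]))
```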
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
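The scoring rule can be sketched as follows. The sketch takes the background mean and standard deviation as given, since in the analysis they come from the mixture model fitted in R; the numerical values below are hypothetical and the function names are chosen for the example.

```python
from statistics import median

Z_THRESHOLD = 2.5  # detection threshold quoted in the text

def interaction_score(replicates):
    """Interaction score (Is): median of the replicate colony values."""
    return median(replicates)

def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb estimated from the background component."""
    return (score - background_mean) / background_sd

def is_detected(score, background_mean, background_sd):
    return z_score(score, background_mean, background_sd) > Z_THRESHOLD

# Hypothetical background parameters and replicate values
b, sdb = 8.0, 0.5
is_value = interaction_score([9.4, 9.6, 9.5, 9.7])
print(z_score(is_value, b, sdb), is_detected(is_value, b, sdb))
```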
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
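The extreme-outlier rule at the start of this paragraph can be illustrated with a short sketch. The quartile convention is an assumption (the "inclusive" method of the Python standard library); the convention used in the original analysis is not specified.

```python
from statistics import quantiles

def extreme_outlier_bounds(values, k=3.0):
    """Return the fences (Q1 - k*IQR, Q3 + k*IQR), with IQR = Q3 - Q1."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")  # assumed quartile convention
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the fences."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]

# Hypothetical replicate values: the last one is far outside the fences
print(drop_extreme_outliers([10, 11, 12, 13, 14, 100]))
```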
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
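The three categories can be summarized in a small decision function. This is a simplified sketch: it assumes that the differences and the FDR-adjusted p-values have already been computed, and the "not detected" label is added only to make the function total. All names are hypothetical.

```python
ALPHA = 0.05        # FDR threshold quoted in the text
MIN_ABS_DIFF = 1.0  # minimum absolute difference versus 2xL-2xL quoted in the text

def classify_pair(detected_ref, longer):
    """Classify one protein pair.

    detected_ref: True if the pair was detected with the reference 2xL-2xL combination.
    longer: dict mapping each longer-linker combination to a tuple
            (detected, difference_vs_reference, fdr_adjusted_p_value).
    """
    if not detected_ref:
        if any(detected for detected, _, _ in longer.values()):
            return "new"
        return "not detected"
    for _, difference, p_adjusted in longer.values():
        if abs(difference) > MIN_ABS_DIFF and p_adjusted < ALPHA:
            return "quantitative change"
    return "unchanged"

# Hypothetical example: detected only with the 4xL-4xL combination
print(classify_pair(False, {"2xL-4xL": (False, 0.2, 0.6), "4xL-4xL": (True, 2.1, 0.01)}))
```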
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface
residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered as surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
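The surface-path calculation can be sketched as below. The analysis itself relied on the NetworkX implementation of Dijkstra's algorithm; to keep the example free of dependencies, this sketch uses a heap-based Dijkstra from the standard library instead. The coordinates and node labels are hypothetical, and the 15 Å cutoff is the value quoted for the RNApol II.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """coords: dict node -> (x, y, z). Nodes within `cutoff` are linked; weight = distance."""
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if the target cannot be reached."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf

# Hypothetical surface C-alpha coordinates (in Å)
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (20, 10, 0)}
graph = build_graph(coords, cutoff=15.0)
print(shortest_path_length(graph, "A", "D"))
```

Because edges exist only between nearby surface residues, the path follows the surface of the complex rather than cutting through it, which is why the resulting distance can exceed the straight-line distance between the two C-termini.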
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that
are within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
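The size of the global screen follows directly from the numbers given in the text and in the Material and Methods, as this short check shows (variable names are chosen for the example):

```python
n_bait_proteins = 15
linker_versions = ("2xL", "3xL", "4xL")
n_selected_preys = 495  # preys from the same complexes or reported in BioGRID
n_random_preys = 97     # random cytoplasmic or nuclear controls

n_baits = n_bait_proteins * len(linker_versions)
n_preys = n_selected_preys + n_random_preys
n_tests = n_baits * n_preys
print(n_baits, n_preys, n_tests)  # prints: 45 592 26640
```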
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one
PPI) after filtration.
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
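The counts reported in this paragraph are internally consistent, which can be checked with a few lines of arithmetic (variable names are chosen for the example):

```python
total_pairs, total_unique = 228, 197
unchanged, unchanged_unique = 135, 114
quantitative, quantitative_unique = 19, 18
new, new_unique = 74, 65

# Pairs affected by the longer linker: quantitative changes plus new interactions
affected = quantitative + new
affected_unique = quantitative_unique + new_unique
print(affected, affected_unique)

# Percentage of pairs with no significant change
print(round(100 * unchanged / total_pairs, 1), round(100 * unchanged_unique / total_unique, 1))
```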
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
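The correlation analysis described above can be illustrated with a minimal sketch. It assumes that each PPI is summarized by one C-terminus distance and one z-score, and that the ten distance classes are of equal width; the binning rule is not stated in the text, and the function names and numbers below are hypothetical, used only for illustration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def bin_by_distance(distances, zscores, n_bins=10):
    """Mean z-score per equal-width distance class (None for empty classes)."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all distances are equal
    classes = [[] for _ in range(n_bins)]
    for d, z in zip(distances, zscores):
        index = min(int((d - lo) / width), n_bins - 1)
        classes[index].append(z)
    return [mean(c) if c else None for c in classes]


# Hypothetical distances (in Å) and z-scores, not data from this study.
example_distances = [20, 35, 50, 80, 120, 160]
example_zscores = [9.1, 7.4, 6.0, 3.2, 2.8, 0.5]
print(spearman(example_distances, example_zscores))
print(bin_by_distance(example_distances, example_zscores))
```

A negative coefficient, as reported for the proteasome, means that z-scores tend to decrease as the distance between C-termini increases.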
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information on the subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1 Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted
with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
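Counting reciprocal bait-prey orientations as a single PPI, as in panel (C), can be sketched as follows. This is only an illustration: the caption does not say how the two reciprocal measurements were combined, so keeping the larger z-score is an assumption, and the function name and values are hypothetical.

```python
def collapse_reciprocal(observations):
    """Merge (bait, prey, z-score) records so that X-Y and Y-X count as one PPI.

    Assumption: the larger of the two reciprocal z-scores is kept.
    """
    best = {}
    for bait, prey, zscore in observations:
        pair = frozenset((bait, prey))  # unordered pair: orientation is ignored
        if pair not in best or zscore > best[pair]:
            best[pair] = zscore
    return best


# Hypothetical z-scores for proteasome subunits named in this document.
records = [
    ("Rpn5", "Scl1", 3.1),
    ("Scl1", "Rpn5", 4.2),  # reciprocal orientation of the first record
    ("Rpn5", "Rpn6", 8.0),
]
unique_ppis = collapse_reciprocal(records)
print(len(unique_ppis))
```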
Figure 2 Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini: interactions that are not affected by longer linkers,
those that increase in signal and those that are newly detected. P-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between the proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1 Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in standard PCA (left).
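The thresholding in panel (B) can be sketched as a z-score cut-off on colony sizes. This is a minimal illustration under stated assumptions: the background is taken here to be a set of colony sizes from presumed non-interacting pairs, which this caption does not specify, and all names and numbers are hypothetical. Only the 2.5 threshold comes from the caption.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # threshold on z-scores stated in the caption


def z_scores(colony_sizes, background):
    """Deviation of each colony size from the background, in standard deviations."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation of the background
    return [(size - mu) / sigma for size in colony_sizes]


def is_positive(colony_sizes, background, threshold=Z_THRESHOLD):
    """Flag each PPI whose z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes (arbitrary units), not data from this study.
background_sizes = [10, 12, 11, 9, 13]
tested_sizes = [11, 20]
print(is_positive(tested_sizes, background_sizes))
```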
Figure S2 Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot
surface. Green spheres: surface residues on the proteasome.
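A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm. This is an illustration only and not necessarily the procedure used here: it assumes that surface residues and C-termini are given as 3D coordinates in Å, and that two points are connected when they lie within a cutoff distance, with the edge weighted by their Euclidean distance. The cutoff, point names and coordinates are hypothetical.

```python
import heapq
import math


def shortest_path_length(points, start, end, cutoff):
    """Length of the shortest path from start to end through neighbouring points.

    points maps a name to (x, y, z) coordinates; two points are linked when
    their Euclidean distance is at most `cutoff` (an assumed graph rule).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in visited:
            continue
        if u == end:
            return dist_u
        visited.add(u)
        for v, coords in points.items():
            if v in visited:
                continue
            weight = math.dist(points[u], coords)
            if weight <= cutoff and dist_u + weight < best.get(v, math.inf):
                best[v] = dist_u + weight
                heapq.heappush(heap, (best[v], v))
    return math.inf


# Hypothetical coordinates: two C-termini joined through two surface residues.
toy_points = {
    "Cter_A": (0.0, 0.0, 0.0),
    "surface_1": (3.0, 0.0, 0.0),
    "surface_2": (3.0, 4.0, 0.0),
    "Cter_B": (6.0, 4.0, 0.0),
}
print(shortest_path_length(toy_points, "Cter_A", "Cter_B", cutoff=4.5))
```

Because the path is forced to follow neighbouring surface points, its length is at least the straight-line distance between the two C-termini.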
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are in
proximity in space without these necessarily being physical interactions. Such a method
would thus make it possible to explore and dissect the architecture of protein complexes
in greater depth. Concretely, the aim was to modify the length of the linkers of the DHFR
PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing the
linker length changed the interactions detected. It was also relevant to test the application
of the method to the study of protein complexes using several combinations of linkers of
different lengths. Finally, the validation of the method could be completed by comparing
the results obtained with the distances measured from the available protein structures of
the proteasome.
The results of the first validation show that by adjusting a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions
while preserving the specificity of the method. Indeed, among all the unique interactions
detected, more than 30% were new. One could therefore expect to obtain nearly as many
new interactions if this variation of the PCA were applied to protein complexes that have
already been studied. This percentage could vary with the number of combinations of
linkers of different lengths that are used. For example, it could be reduced by using a
single combination, since some protein-protein associations were detectable only with a
specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment
appears to be sufficient to detect most of the new PPIs and those whose signal increases.
The rare cases where the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers is informative about
the organization of protein complexes, particularly when it involves the central proteins.
Finally, the detected associations reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures
with the PCA results obtained, it is possible to confirm that increasing the linker length
does allow the detection of associations between proteins that are more distant in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of additional lengths, which would provide a validation and a point
of comparison and would help detect problems that may have arisen during the construction
of the proteins. For example, it is easier to spot a problem of incorrect recombination or
the appearance of mutations. Indeed, interactions would be observed for the correctly
constructed protein whereas the problematic one would show none. However, adding this
control certainly makes the experiments and analyses more complex. Despite this drawback,
this variation of the DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no particular infrastructure, yet can also be applied at large
scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains
the endogenous promoter for protein expression. The fragments do not tend to interact
spontaneously with each other unless they are very close together, which reduces false
positives. The DHFR PCA can be performed either on solid medium or in liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be followed in real time, which gives access to the study of
interaction dynamics (56). These features provide certain advantages compared to other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths giving several resolutions of the PPI network. One would first
have to determine the maximal length that allows the detection of plausible protein-protein
associations while limiting false positives. One would also have to determine the optimal
increment to maximize new information, taking into account the additional complexity that
comes with each added linker. The availability of robotic platforms makes the creation of
collections of DHFR F[1,2] proteins with different linker lengths more realistic. Such
additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the more the linker length is
increased, the more distant the detected associations are, which decreases the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into consideration. For small protein
complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas
the resolution should be lower for large protein complexes.
The method developed during this master's project becomes particularly interesting for
the study of macromolecular protein complexes. These are complexes whose composition is
not perfectly known but that are visible by electron microscopy or other imaging methods.
The size of these complexes greatly limits their study and makes determining their
architecture a challenge. Processing bodies and stress granules are examples. They are
involved, respectively, in the degradation and the preservation of messenger RNA during
cellular stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the
linker would give us a general picture of their architecture. At the scale of an organism's
proteome, this method would provide a better definition of the organization of the cellular
machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
technological advantages required for such an approach by complementing methods
detecting co-complex membership and direct interactions.
PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was
shown that the distance constraint determined by linker length could affect the ability to
detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between the C-termini of proteins could be measured, protein
interactions were 35 times more likely to be detected if the C-termini were within less than
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a
dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes
could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly,
distances between proteins in living cells. Here, we test the effect of linker size on the
ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
Material and Methods
Yeast
Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
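As a rough illustration of how the 2xL, 3xL and 4xL linkers differ, the short Python sketch below builds the peptide sequence of each linker from the Gly-Gly-Gly-Gly-Ser motif and gives an upper bound on its span. The rise of 3.5 Å per residue for a fully extended chain is an illustrative assumption that does not come from this study, and the function names are hypothetical.

```python
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser, the repeated motif described in the text

# Assumed rise per residue (in angstroms) for a fully extended chain.
# Illustrative value only; it is not a measurement from this work.
ASSUMED_RISE_PER_RESIDUE = 3.5


def linker_sequence(repeats: int) -> str:
    """Return the peptide sequence of a linker made of `repeats` motifs."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repetition of the motif")
    return MOTIF * repeats


def max_extended_length(repeats: int, rise: float = ASSUMED_RISE_PER_RESIDUE) -> float:
    """Upper bound on the linker span if the chain were fully extended."""
    return len(linker_sequence(repeats)) * rise


for n in (2, 3, 4):
    seq = linker_sequence(n)
    print(f"{n}xL: {seq} ({len(seq)} aa, at most ~{max_extended_length(n):.1f} Å)")
```

A flexible linker in a living cell is unlikely to be fully extended, so these values should be read as loose upper limits rather than as distances actually reached.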
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from the gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm
plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing for all
strains confirmed proper DHFR fragment fusions.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused with
the 2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies directed against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or
mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD goat anti-rabbit IgG (1:10000),
IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected on the criteria that they belonged to the same
complexes as the baits or that they interacted with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by Benschop et al. (84) to choose 95 central and
associated protein members of the following complexes: the RNApol I, II and III, the proteasome
and the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned in a way that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Each 1536-array was finally designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
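The array design described above can be sketched in a few lines of Python. The sketch assumes the conventional 32 × 48 geometry of a 1536-position array and a border two colonies wide; the ordering of the two linkers within each label (first versus second partner) and all function names are hypothetical choices made for illustration.

```python
from itertools import product

ROWS, COLS = 32, 48      # assumed geometry of a 1536-position colony array
BORDER_WIDTH = 2         # "double border" of control colonies
LINKERS = ("2xL", "4xL")


def linker_block():
    """All linker-size combinations tested within one block of four strains."""
    return [f"{first}-{second}" for first, second in product(LINKERS, repeat=2)]


def border_positions(rows=ROWS, cols=COLS, width=BORDER_WIDTH):
    """(row, column) positions reserved for the weakly interacting control."""
    return [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if r < width or c < width or r >= rows - width or c >= cols - width
    ]


inner = ROWS * COLS - len(border_positions())
print(linker_block())
print(f"{len(border_positions())} border positions, {inner} inner positions")
```

Under these assumptions, the inner positions are what remains available for the randomly placed blocks of four strains.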
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs whose
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
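For the spot-dilution assay, the cell density of each spot follows directly from the starting density and the dilution factor. The Python sketch below computes such a series; the number of dilution steps is a hypothetical parameter, since it is not stated in the text.

```python
def serial_dilution(start_od=1.0, fold=5, steps=4):
    """Return the OD600/mL of each spot, starting with the undiluted culture.

    `steps` (number of successive dilutions) is a hypothetical value;
    only the starting density and the 5-fold factor come from the text.
    """
    if fold <= 1:
        raise ValueError("the dilution factor must be greater than 1")
    if steps < 0:
        raise ValueError("the number of steps cannot be negative")
    return [start_od / fold ** i for i in range(steps + 1)]


for i, od in enumerate(serial_dilution()):
    print(f"spot {i}: OD600/mL = {od:g}")
```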
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area smaller than 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
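The particle selection rules above can be illustrated with a small Python sketch. It is not the custom ImageJ script used in this work. It assumes that circularity is defined as 4π × area / perimeter², as in ImageJ, and that a particle failing either the area or the circularity criterion is excluded; the dictionary keys and function names are hypothetical.

```python
import math

MIN_AREA = 20           # pixels
MIN_CIRCULARITY = 0.5


def circularity(area, perimeter):
    """4*pi*area / perimeter**2; equals 1.0 for a perfect circle."""
    if perimeter <= 0:
        return 0.0
    return 4 * math.pi * area / perimeter ** 2


def keep_particle(p):
    """Apply the edge, area and circularity filters to one particle."""
    return (
        not p["touches_edge"]
        and p["area"] >= MIN_AREA
        and circularity(p["area"], p["perimeter"]) >= MIN_CIRCULARITY
    )


def pick_colony(particles, center):
    """Among particles passing the filters, return the one closest to `center`."""
    candidates = [p for p in particles if keep_particle(p)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: math.dist(p["centroid"], center))
```

Each particle is represented here as a dictionary with `area`, `perimeter`, `touches_edge` and `centroid` entries, which is only one convenient way to hold the measurements.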
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
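A minimal Python sketch of this transformation and filter is given below. It assumes that the threshold of 16 is applied to the untransformed size measured on the diploid selection plate; the function name is hypothetical.

```python
import math

MIN_DIPLOID_SIZE = 16  # threshold from the text, assumed to be on the raw scale


def colony_score(mtx_intensity, diploid_size):
    """Return log2(intensity + 1), or None if the diploid colony is too small."""
    if diploid_size < MIN_DIPLOID_SIZE:
        return None
    return math.log2(mtx_intensity + 1)


print(colony_score(1023, 40))  # colony kept and transformed
print(colony_score(1023, 10))  # colony discarded
```

Adding 1 before taking the logarithm keeps colonies with a null intensity in the data set with a score of 0.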
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
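The scoring procedure can be illustrated with the following Python sketch. The analysis itself was done with normalmixEM from the R package mixtools; the plain expectation-maximization routine below is only a stand-in written for illustration, and it assumes that the background is the mixture component with the lower mean. The synthetic data and all function names are hypothetical.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean, so the
    first component is the one assumed here to be the background.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    mu = [statistics.mean(xs[:half]), statistics.mean(xs[half:])]
    sd = [statistics.pstdev(xs) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weights, means and standard deviations
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(w, mu, sd), key=lambda c: c[1])


def z_score(i_s, b, sd_b):
    """Zs = (Is - b) / sdb"""
    return (i_s - b) / sd_b


# Synthetic interaction scores: a large background and a smaller signal group
random.seed(0)
scores = [random.gauss(8.0, 0.5) for _ in range(300)]
scores += [random.gauss(12.0, 1.0) for _ in range(100)]
(_, b, sd_b), _signal = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, b, sd_b) > 2.5]
print(f"b = {b:.2f}, sdb = {sd_b:.2f}, {len(detected)} scores with Zs > 2.5")
```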
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 − 3×(Q3 − Q1) or above Q3 + 3×(Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction
and positioned on the array edges were removed from downstream analyses, as well as strains
for which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique)
pairs of interacting proteins.
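The outlier rule and the replicate requirement described above can be sketched as follows in Python. The way quartiles are computed is not specified in the text; the "inclusive" method of the standard library is an assumption made here, and the function names are hypothetical.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Drop values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median colony size (Is), or None if too few replicates remain."""
    if len(replicates) < min_replicates:
        return None
    return statistics.median(replicates)


sizes = [10, 11, 12, 11, 10, 12, 11, 40]
kept = remove_extreme_outliers(sizes)
print(kept, interaction_score(kept))
```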
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C).
For the remaining pairs, because baits and preys were positioned in a way that, in a block of
four adjacent strains, all combinations of linker lengths could be tested for a specific
interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size
combinations could be compared directly. The difference with the reference 2xL-2xL interaction
was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change
(t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
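The classification into new, unchanged and quantitatively changed interactions can be summarized by the Python sketch below. The text does not name the FDR procedure; the Benjamini-Hochberg adjustment is used here as an assumption. Pairs with a significant p-value but a difference between -1 and 1 are treated as unchanged, which is also an interpretation. The paired t-test itself is not reimplemented, and all names are hypothetical.

```python
REFERENCE = "2xL-2xL"


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected, diffs, q_values, alpha=0.05, min_diff=1.0):
    """Classify one protein pair.

    detected: {combination: True if Zs > 2.5}, including the reference.
    diffs:    {longer combination: Is difference with the reference}.
    q_values: {longer combination: FDR-corrected paired t-test p-value}.
    """
    longer = [c for c in detected if c != REFERENCE]
    if not detected[REFERENCE]:
        return "new" if any(detected[c] for c in longer) else "not detected"
    changed = any(
        abs(diffs[c]) > min_diff and q_values[c] < alpha for c in longer
    )
    return "quantitative change" if changed else "unchanged"


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```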
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between C-terminal
residues. In several cases, the structure of the protein is not complete in the C-terminal
section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as
nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the two nodes it connects. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
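The distance calculation can be illustrated with a short Python sketch. The analysis described here used NetworkX; the version below swaps in a standard-library Dijkstra implementation (`heapq`) so that it runs without dependencies. The function names, the toy coordinates and the cutoff values in the usage note are hypothetical and chosen only for illustration.

```python
import heapq
import math

Point = tuple[float, float, float]


def surface_residues(ca_coords: dict[str, Point], dots: list[Point],
                     cutoff: float) -> dict[str, Point]:
    """Keep residues whose C-alpha lies within `cutoff` of any solvent dot."""
    return {
        res: xyz
        for res, xyz in ca_coords.items()
        if any(math.dist(xyz, dot) <= cutoff for dot in dots)
    }


def build_graph(nodes: dict[str, Point], cutoff: float) -> dict[str, dict[str, float]]:
    """Connect every pair of nodes within `cutoff`; edge weight = distance."""
    graph: dict[str, dict[str, float]] = {res: {} for res in nodes}
    names = list(nodes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(nodes[a], nodes[b])
            if d <= cutoff:
                graph[a][b] = d
                graph[b][a] = d
    return graph


def shortest_path_length(graph: dict[str, dict[str, float]],
                         source: str, target: str) -> float:
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf
```

With three collinear surface residues spaced 10 Å apart and an edge cutoff of 15 Å, the two end residues are not directly connected, so the path runs through the middle residue and has length 20 Å. The path therefore follows the surface of the complex instead of cutting through it.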
All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins within
the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
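The size of the screen and the hit-calling rule can be checked with a few lines of Python. This is only a sketch: the constant names and `count_detected` are hypothetical, and the z-scores in the usage note are made-up placeholders rather than values from Table S1B.

```python
N_BAIT_PROTEINS = 15              # proteins tagged with DHFR F[1,2]
LINKERS = ("2xL", "3xL", "4xL")   # linker lengths tested on each bait
N_PREYS = 592                     # prey proteins fused to DHFR F[3]
Z_THRESHOLD = 2.5                 # a PPI is called when its z-score exceeds this

n_baits = N_BAIT_PROTEINS * len(LINKERS)   # 15 x 3 = 45 bait strains
n_tested = n_baits * N_PREYS               # 45 x 592 = 26,640 bait-prey tests


def count_detected(z_scores: dict[str, list[float]],
                   threshold: float = Z_THRESHOLD) -> dict[str, int]:
    """Count, for each linker, how many z-scores exceed the threshold."""
    return {linker: sum(z > threshold for z in zs)
            for linker, zs in z_scores.items()}
```

Calling `count_detected({"2xL": [1.0, 3.0], "4xL": [2.6, 3.0]})` on these placeholder values gives one hit for the 2xL and two for the 4xL, which is the kind of per-linker tally reported in the text (99, 110 and 126 PPIs).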
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, the proteasome and the COG complex), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL,
4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs after filtering (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
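The difference between the two counts (228 pairs versus 197 unique pairs) comes from collapsing reciprocal bait-prey orientations. A minimal sketch of that bookkeeping follows; the function name `unique_pairs` and the placeholder protein labels X, Y and Z are illustrative only.

```python
def unique_pairs(directed_ppis: list[tuple[str, str]]) -> set[frozenset[str]]:
    """Collapse (bait, prey) and (prey, bait) into one unordered pair."""
    return {frozenset((bait, prey)) for bait, prey in directed_ppis}


# X-DHFR F[1,2] / Y-DHFR F[3] and its reciprocal count once; X-Z counts once.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_directed = len(detected)              # 3 oriented bait-prey pairs
n_unique = len(unique_pairs(detected))  # 2 unordered protein pairs
```

A `frozenset` is used because it ignores order and is hashable, so both orientations of a pair map to the same set element.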
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL
compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). Together, these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D).
Finally, we find that the distance among proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
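The correlation reported between pairwise distance and z-score is a Spearman rank correlation (see the legend of Figure 2B). The sketch below shows how such a coefficient is computed, using only the standard library; the helper names are hypothetical, the input values in the usage note are placeholders rather than measured data, and the original analysis may well have used a statistics package instead.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For placeholder distances `[10, 20, 30, 40]` and z-scores that strictly decrease with distance, `spearman` returns a coefficient of -1 (up to floating-point rounding), the limiting case of the negative relationship described in the text.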
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini: interactions that are not affected by longer linkers,
those that increase in signal and those that are newly detected. P-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results for the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: surface
dots. Green spheres: surface residues on the proteasome.
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are in
close proximity in space without necessarily being in physical interaction. Such a method
would make it possible to examine and dissect the architecture of protein complexes in
greater depth. In practice, this meant modifying the length of the linkers of the DHFR PCA
in S. cerevisiae. To validate the method, we first had to verify whether increasing the
linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, validation of the method could be completed by
comparing the results obtained with the distances measured from the available protein
structures of the proteasome.
The results of the first validation show that by adjusting a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed
within protein complexes and between different complexes, notably between the proteasome
and the actin cytoskeleton. The nature of the associations detected suggests that the
specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth
study of the five protein complexes shows that this variation of the DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain nearly
as many new interactions if this variation of the PCA were applied to protein complexes
that have already been studied. This percentage could vary with the number of combinations
of linkers of different lengths used. For example, it could be reduced by using only one
combination, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems
sufficient to detect most of the new PPIs and those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more
likely caused by steric effects than by destabilization of the proteins involved. These
cases can nevertheless provide structural information, notably by identifying the
strongest associations within the complex. Moreover, the use of extended linkers is
informative about the organization of protein complexes, particularly when it involves the
central proteins. Finally, the associations detected reflect well the organization of
protein complexes into subcomplexes. By comparing the distances between proteins in the
proteasome structures with the PCA results obtained, it is possible to confirm that
increasing the linker length does allow the detection of associations between proteins
that are farther apart in space.
The modification made to the DHFR PCA represents a worthwhile advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. For future experiments, it would be appropriate to use the standard linker
in addition to linkers of additional lengths, which would provide a validation and a point
of comparison and would help detect problems that arose during construction of the fusion
proteins. For example, a problem of incorrect recombination or the appearance of mutations
is easier to spot. Indeed, interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. However, adding this control
certainly complicates the experiments and analyses. Despite this drawback, this variation
of the DHFR PCA provides an additional hybrid method that remains relatively simple. It
requires no special infrastructure, yet it can also be applied on a large scale using a
robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous
promoter for protein expression. The fragments do not tend to associate spontaneously
unless they are very close together, which reduces false positives. The DHFR PCA can be
performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the study of interaction dynamics (56). These
features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths giving several resolutions of the PPI network. One would first
need to determine the maximum length that allows plausible protein-protein associations to
be detected while limiting false positives. One would also need to determine the optimal
increment to maximize new information while taking into account the additional complexity
introduced with each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the associations detected, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, might be sufficient, whereas the resolution
would need to be lower for large protein complexes.
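To give a rough feel for how linker length scales the reachable distance, the sketch below
computes an upper bound on the span of the two linkers for different repeat numbers of the
Gly-Gly-Gly-Gly-Ser motif. It is an illustration under stated assumptions, not a result of
this work: it treats each linker as a fully extended chain with about 3.8 angstroms between
consecutive alpha carbons, and it ignores the size of the DHFR fragments and of the tagged
proteins. Flexible linkers in the cell are expected to span less than this bound.

```python
MOTIF = "GGGGS"              # Gly-Gly-Gly-Gly-Ser repeat unit
CA_SPACING_ANGSTROM = 3.8    # assumed spacing between consecutive alpha carbons


def linker_contour_length(repeats):
    """Upper-bound length (angstroms) of a linker made of `repeats` motifs."""
    return repeats * len(MOTIF) * CA_SPACING_ANGSTROM


def pair_reach(bait_repeats, prey_repeats):
    """Upper-bound span (angstroms) of the two linkers of a bait-prey pair."""
    return linker_contour_length(bait_repeats) + linker_contour_length(prey_repeats)


for bait, prey in [(2, 2), (4, 2), (4, 4)]:
    print(f"{bait}xL-{prey}xL: at most {pair_reach(bait, prey):.0f} angstroms")
```

The point of the sketch is only that the bound grows linearly with the number of repeats,
so each added repeat widens the detection radius by a fixed step and coarsens the
resolution accordingly.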
The method developed during this master's project becomes particularly interesting for
the study of macromolecular protein complexes. These are complexes whose composition is
not fully known but that are visible by electron microscopy or other imaging methods. The
size of these complexes greatly limits their study and makes determining their
architecture a challenge. Processing bodies and stress granules are examples. They are
involved, respectively, in the degradation and storage of messenger RNA during cellular
stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the
linker would give us a general picture of their architecture. At the level of an
organism's proteome, this method would provide a better definition of the organization of
the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Bacteria
Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).
Plasmid construction
Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.
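The idea of adding linker repeats with synonymous codons can be sketched as follows. The
codon choices below are hypothetical and are not the sequences used in the plasmids. They
only show that each Gly-Gly-Gly-Gly-Ser repeat can be written with a different DNA
sequence while the translated peptide stays the same. The function names are illustrative.

```python
# Standard genetic code entries for glycine and serine.
GLY_CODONS = ["GGT", "GGC", "GGA", "GGG"]
SER_CODONS = ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]
CODON_TO_AA = {**{c: "G" for c in GLY_CODONS}, **{c: "S" for c in SER_CODONS}}


def encode_repeat(index):
    """DNA for one GGGGS repeat; the codon choice is rotated by `index`."""
    gly = [GLY_CODONS[(index + j) % len(GLY_CODONS)] for j in range(4)]
    ser = SER_CODONS[index % len(SER_CODONS)]
    return "".join(gly) + ser


def encode_linker(repeats):
    """DNA for `repeats` consecutive GGGGS motifs, each written differently."""
    return "".join(encode_repeat(i) for i in range(repeats))


def translate(dna):
    """Translate a Gly/Ser-only coding sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


dna_4x = encode_linker(4)
print(dna_4x)
print(translate(dna_4x))
```

Rotating the codon choice per repeat keeps the four 15-nucleotide blocks distinct at the
DNA level, which is the property the synonymous-codon design relies on.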
In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted
from a gel. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were verified by double
digestion with XbaI and BamHI and by Sanger sequencing.
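As a worked example of what an insert:vector ratio implies at the bench, the sketch below
converts a ratio into a mass of insert. It assumes the ratio is a molar ratio, which the
text does not state explicitly, and it uses hypothetical fragment sizes and amounts. For
double-stranded DNA the number of moles is proportional to mass divided by length, so the
insert mass is ratio x vector mass x (insert length / vector length).

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) needed for a given insert:vector molar ratio."""
    if min(vector_ng, vector_bp, insert_bp, molar_ratio) <= 0:
        raise ValueError("all quantities must be positive")
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical values: 100 ng of a 5000 bp digested vector and a 500 bp insert.
print(insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=500, molar_ratio=5))
```

With these illustrative numbers, a 5:1 ratio corresponds to 50 ng of insert. The same
function can be called once per insert for a two-insert assembly.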
The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from a gel. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively),
were amplified by PCR from their respective plasmids with oligonucleotides specific to the
gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D).
BY4741 and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused
with the 2xL and 4xL. These proteins were selected because we could easily assess their
abundance using antibodies raised against them. 20 OD600 of exponentially growing cells
were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG
(1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and signal on membranes was detected using the
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
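One common way to use a loading control such as actin in this kind of quantification is
sketched below. The exact calculation performed in Image Studio Lite is not described in
the text, so this is an assumption: each target band is divided by the actin band of the
same lane, and the 4xL value is then expressed relative to the 2xL value. All signal values
are hypothetical.

```python
def normalized_abundance(target_signal, actin_signal):
    """Target band intensity divided by the actin band of the same lane."""
    if actin_signal <= 0:
        raise ValueError("actin signal must be positive")
    return target_signal / actin_signal


def relative_abundance(target_4x, actin_4x, target_2x, actin_2x):
    """Abundance of the 4xL fusion relative to the 2xL fusion (1.0 = equal)."""
    return normalized_abundance(target_4x, actin_4x) / normalized_abundance(target_2x, actin_2x)


# Hypothetical band intensities (arbitrary units).
print(relative_abundance(target_4x=300.0, actin_4x=150.0, target_2x=200.0, actin_2x=100.0))
```

A ratio close to 1 would indicate that lengthening the linker did not noticeably change
the abundance of the fusion protein in this illustrative calculation.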
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belonged to the same complexes as the baits or
interacted with one of them based on data reported in BioGRID in October 2014 (96). A
random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus
was also included in the set of preys as controls. Each prey was present in four
replicates, two on each prey plate, so each interaction was measured four times. Preys
were randomly positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered
the consensus protein complexes published by Benschop et al. (84) to choose 95 central and
associated proteins that are members of the following complexes: RNApol I, II and III, the
proteasome and the COG complex. These complexes were selected because they vary in size
(RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome
(n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of
these complexes have been shown to be detectable, at least partially, by DHFR PCA. In
addition, there are published structures available for the RNApol and proteasome
complexes, making it possible to compare our results with known protein complex
organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0%
and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG
complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or
2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 out of the 95 proteins
initially selected are tagged with 2xL and 4xL in at least one mating type). MATα
2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were
generated, including all strains mentioned above.
Baits and preys were positioned so that, within a block of four strains, all combinations
of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol
and COG complexes and in 16 replicates for the proteasome complex. The blocks were
randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain
a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to
avoid any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two
rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/ml of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted, and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
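The particle filter described above can be sketched as follows. This is a minimal illustration in Python, not the custom ImageJ script used for the screen. It assumes the usual ImageJ definition of circularity, 4π × area / perimeter², and the function names and default thresholds (area of 20 pixels, circularity of 0.5, our reading of the values in the text) are hypothetical.

```python
import math

def circularity(area, perimeter):
    """Circularity as commonly defined in ImageJ: 1.0 for a perfect circle,
    approaching 0 for elongated shapes."""
    return 4 * math.pi * area / perimeter ** 2

def keep_particle(area, perimeter, min_area=20, min_circularity=0.5):
    """Return True if a detected particle passes both filters.
    Thresholds are assumptions based on our reading of the text."""
    if area < min_area:
        return False
    return circularity(area, perimeter) >= min_circularity
```

A round colony of radius 10 pixels passes both filters, whereas a thin streak of 100 × 1 pixels is rejected on circularity.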
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
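As an illustration of this transformation, the sketch below applies log2(x + 1) to MTX colony intensities and discards colonies that were too small on the diploid selection plate. The record layout (dictionaries with `mtx_intensity` and `diploid_size` keys) and the function names are hypothetical, and the size threshold is passed as a parameter rather than hard-coded.

```python
import math

def log2_intensity(raw_intensity):
    """log2(x + 1); the +1 offset keeps a zero intensity defined (it maps to 0)."""
    return math.log2(raw_intensity + 1)

def filter_and_transform(colonies, min_diploid_size):
    """Keep colonies whose diploid-plate size is at least min_diploid_size
    and attach the log2-transformed MTX intensity to each kept record."""
    kept = []
    for colony in colonies:
        if colony["diploid_size"] < min_diploid_size:
            continue  # poor growth on diploid selection: drop the colony
        transformed = dict(colony)
        transformed["log2_intensity"] = log2_intensity(colony["mtx_intensity"])
        kept.append(transformed)
    return kept
```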
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is – b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
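To make the scoring step concrete, here is a minimal Python sketch of the same idea. It is not the mixtools implementation used in the study: the two-component fit is a basic expectation-maximisation loop written for illustration, the component with the lower mean is assumed to be the background, and the function names are hypothetical. The cutoff of 2.5 in the usage note is our reading of the threshold given in the text.

```python
import math
from statistics import NormalDist

def fit_two_normals(scores, n_iter=200):
    """Basic EM fit of a two-component 1-D normal mixture.
    Returns two (weight, mean, sd) tuples sorted by mean."""
    xs = sorted(scores)
    half = len(xs) // 2
    params = []
    # Initialise each component from one half of the sorted data.
    for part in (xs[:half], xs[half:]):
        mu = sum(part) / len(part)
        sd = math.sqrt(sum((x - mu) ** 2 for x in part) / len(part)) or 1.0
        params.append([0.5, mu, sd])
    for _ in range(n_iter):
        comps = [NormalDist(mu, sd) for _, mu, sd in params]
        # E-step: probability that each point belongs to component 0.
        resp = []
        for x in xs:
            p0 = params[0][0] * comps[0].pdf(x)
            p1 = params[1][0] * comps[1].pdf(x)
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        for k, r in enumerate((resp, [1 - v for v in resp])):
            n_k = sum(r)
            if n_k <= 0:
                continue
            mu = sum(w * x for w, x in zip(r, xs)) / n_k
            var = sum(w * (x - mu) ** 2 for w, x in zip(r, xs)) / n_k
            params[k] = [n_k / len(xs), mu, max(math.sqrt(var), 1e-6)]
    return sorted((tuple(p) for p in params), key=lambda p: p[1])

def z_scores(scores, background_mean, background_sd):
    """Zs = (Is - b) / sdb for each interaction score."""
    return [(s - background_mean) / background_sd for s in scores]
```

With `background, signal = fit_two_normals(scores)`, the call `z_scores(scores, background[1], background[2])` gives the Zs values, and interactions can then be flagged with `z > 2.5`.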
For the intra-complexes experiment, extreme outliers on the MTX selection plates, that is,
values below Q1 - 3 × (Q3 - Q1) or above Q3 + 3 × (Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction
and positioned on the array edges were removed from downstream analyses, as well as strains
for which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
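The outlier filter and the replicate rule can be sketched as below. This is an illustration only: the quartile convention is an assumption (Python's "inclusive" method, which matches the default of R's quantile function), and the function names are hypothetical.

```python
import statistics

def remove_extreme_outliers(values, k=3.0):
    """Drop values outside [Q1 - k*(Q3-Q1), Q3 + k*(Q3-Q1)]; k=3 as in the text."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def interaction_score(replicates, min_replicates=4):
    """Median colony size across replicates, or None when too few replicates remain."""
    if len(replicates) < min_replicates:
        return None
    return statistics.median(replicates)
```

For example, `remove_extreme_outliers([10, 11, 12, 13, 14, 100])` keeps the first five values and drops 100, since the upper fence is 13.75 + 3 × 2.5 = 21.25.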
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C).
For the remaining pairs, because baits and preys were positioned so that, within a block of
four adjacent strains, all combinations of linker lengths could be tested for a specific
interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size
combinations could be compared directly. The difference with the reference 2xL-2xL
interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A
paired t-test was used to identify significant differences in colony size (with FDR-corrected
p-values). These pairs of interacting proteins were separated into two additional categories:
unchanged interactions, in cases where the interaction was detected with the reference linker
size (2xL-2xL) and also with the longer linker combinations but without any significant
change (t-test FDR p-value above 0.05); and quantitative changes, in cases where the
interaction was detected with the reference linker size (2xL-2xL) and presented significant
changes for at least one longer linker combination (difference greater than 1 or smaller than
-1 with t-test FDR p-value < 0.05) (Table S1C).
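The classification rule can be illustrated with the sketch below. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, as are the function names. Pairs with a significant p-value but an absolute difference of 1 or less are not covered by the description above; this sketch labels them "unchanged", which is also an assumption. The default cutoffs (difference of 1, FDR of 0.05) follow our reading of the text.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(diff, q_value, min_abs_diff=1.0, alpha=0.05):
    """Label one linker combination relative to the 2xL-2xL reference.
    diff is the interaction score difference; q_value the FDR-adjusted p-value."""
    if q_value < alpha and abs(diff) > min_abs_diff:
        return "increased" if diff > 0 else "decreased"
    return "unchanged"
```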
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the core particle (CP),
using the super command in the PyMOL software. Visual inspection of the resulting
superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This
region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B
for the base, the lid and the outer rings of the core. The inner rings of the core were from
structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome
structure. Table S2C presents the identity between the built structure and the experimental
sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). Cα atoms of surface residues
were used as nodes to build the graph. The edges of the graph were placed between each pair
of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The
weight of the edges was equal to the distance between node pairs. Surface residues were
identified as follows. First, the structure of the protein complex was represented using the
“show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for
the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered as surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
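The path-based distance can be illustrated with a small, self-contained sketch. The study used the Dijkstra implementation in NetworkX; the version below is a plain-Python stand-in written only to show the logic, with hypothetical function names. It assumes that an edge links two surface nodes whenever their Euclidean distance does not exceed the cutoff, which is our reading of the cutoff described above.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """coords maps a node name to its (x, y, z) position.
    Nodes closer than the cutoff are linked; the edge weight is their distance."""
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target cannot be reached."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            nd = d + weight
            if nd < best.get(neighbour, math.inf):
                best[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return math.inf
```

Because the path is forced to travel between surface nodes, the resulting distance follows the surface of the complex rather than cutting through it, so it is never shorter than the straight-line distance between the two C-termini.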
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many genetic
interactions and physical interactions (in vitro and in vivo) have been described between the
actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living
cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL
compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with
increased linker size reveals new interactions and could be an improved tool to study
inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the 10192
unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one
PPI) after filtration.
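The distinction between all detected pairs and unique pairs comes down to ignoring the bait/prey orientation. A minimal sketch of that bookkeeping, with a hypothetical function name and placeholder protein labels, is:

```python
def unique_pairs(detected):
    """Collapse reciprocal orientations of (bait, prey) tuples into single pairs.
    frozenset makes ("X", "Y") and ("Y", "X") compare as equal."""
    return {frozenset(pair) for pair in detected}
```

For example, `unique_pairs([("X", "Y"), ("Y", "X"), ("X", "Z")])` contains two entries, because the two orientations of the X and Y pair count as a single PPI.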
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228 or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers.
However, the increased linker length had an obvious impact for 93 (83 unique) interacting
pairs (Fig 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting
pairs and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the
linker length can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left
and right panels). Structural biology in living cells could thus gain from PPI data obtained
with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are more readily detected with longer linker
sizes (Fig S1D). Finally, we find that the distance among proteins is significantly longer for
cases where a longer linker size increases signal or leads to the detection of new interactions
(Fig 2C). This demonstrates once again that longer linker size enhances the ability to detect
interactions, especially for proteins that are more distant in space.
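The relationship between distance and z-score is summarised in the Figure 2 legend with Spearman rank correlations. As an illustration of what that statistic measures, the sketch below computes it in plain Python: both variables are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is returned. The function names are hypothetical and the sketch does not compute p-values.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)
```

A negative value, as reported for the proteasome, means that z-scores tend to decrease as the distance between C-termini increases.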
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institute of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes:
new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer
linker combination). (C) Proportions of quantitatively changed interactions and new PPIs
versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast
proteins Complex
Identity
()
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 971
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 971
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 829
5A5B-centered chain E Pre2 Proteasome 995
5A5B-centered chain EA Rpn11 Proteasome 895
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 859
35
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 931
5A5B-centered chain W Rpn2 Proteasome 909
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 931
Constructed proteasome chain 22 Rpn2 Proteasome 909
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 829
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 895
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
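A count of this kind can be obtained by comparing the full-length sequence of a protein with the last residue that is resolved in the structure. The following is a minimal sketch of that arithmetic, not the procedure used for Table S2D: the function name and the example numbers are hypothetical, and residue numbering is assumed to start at 1 and to match the full-length sequence.

```python
def missing_c_terminal_residues(sequence_length, resolved_residue_numbers):
    """Return how many residues lie between the last resolved residue
    and the C-terminus of the full-length protein.

    sequence_length: number of residues in the full-length protein.
    resolved_residue_numbers: residue numbers (1-based) present in the structure.
    """
    resolved = list(resolved_residue_numbers)
    if not resolved:
        # Nothing is resolved, so the whole chain is missing.
        return sequence_length
    last_resolved = max(resolved)
    if last_resolved > sequence_length:
        raise ValueError("resolved residue number exceeds sequence length")
    return sequence_length - last_resolved


# Hypothetical protein of 137 residues with residues 3 to 100 resolved.
print(missing_c_terminal_residues(137, range(3, 101)))
```

A value of 0 means the C-terminus itself is resolved, so the position of the fused reporter fragment can be read directly from the structure.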
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a
z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected
results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by
DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiments.
Examples for each category of change are shown. Cell growth in the spot-dilution assay (right)
correlates with colony size in the standard PCA (left).
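The thresholding in (B) and the reciprocal comparison in (C) can be illustrated with a short sketch. It is not the analysis pipeline used for the figure: it assumes that z-scores are computed against the mean and standard deviation of all colony sizes in one experiment, the threshold is a parameter (its default, 2.5, is the reading of the caption assumed here), and all names and colony sizes are hypothetical.

```python
import math
import statistics


def z_scores(values):
    """Standardize values against their own mean and sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def positive_ppis(colony_sizes, threshold=2.5):
    """Return the pairs whose colony-size z-score exceeds the threshold."""
    names = list(colony_sizes)
    scores = z_scores([colony_sizes[name] for name in names])
    return {name: z for name, z in zip(names, scores) if z > threshold}


def pearson_r(xs, ys):
    """Pearson correlation, e.g. between bait-prey and prey-bait signals."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den


# Hypothetical colony sizes: nine background pairs and one strong signal.
sizes = {"A-B": 60, "A-C": 10, "A-D": 11, "A-E": 9, "A-F": 10,
         "A-G": 12, "A-H": 8, "A-I": 10, "A-J": 11, "A-K": 9}
print(positive_ppis(sizes))
```

With a single-plate standardization like this one, a strong interaction inflates the standard deviation it is scored against, so a real analysis would more likely estimate the background from a dedicated reference distribution.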
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
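A distance-weighted shortest path of the kind shown in (C) can be sketched as Dijkstra's algorithm on a graph whose nodes are surface residues and whose edges are weighted by Euclidean distance. The sketch below is an illustration only: the neighbor cutoff, the function names and the toy coordinates are assumptions, and the way surface residues were actually selected and connected is not specified here.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of points closer than `cutoff`; edge weight = distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Toy coordinates (in angstroms) standing in for surface residues;
# index 0 and index 3 play the role of the two C-terminal residues.
surface = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (6, 4, 0)]
graph = build_graph(surface, cutoff=5.5)
print(shortest_path_length(graph, 0, 3))
```

Restricting the nodes to surface residues makes the path follow the outside of the complex rather than cut through it, which is a closer stand-in for the route a flexible linker could take than the straight-line distance.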
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close to each
other in space without necessarily interacting physically. Such a method would make it
possible to dissect the architecture of protein complexes in greater depth. Concretely, the
approach was to modify the length of the linkers used in the DHFR PCA in S. cerevisiae. To
validate the method, it was first necessary to verify whether increasing the linker length
changed the interactions detected. It was also relevant to test the application of the method to
the study of protein complexes using several combinations of linkers of different lengths.
Finally, the validity of the method could be further confirmed by comparing the results
obtained with distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have already
been studied. This percentage could vary with the number of combinations of linkers of
different lengths used. For example, it could be lower if only a single combination were used,
since some protein-protein associations were detectable only with a specific combination of
linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to
detect the majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. However, these cases
can still provide structural information, notably by identifying the strongest associations
within the complex. Furthermore, the use of extended linkers informs on the organization of
protein complexes, particularly when central proteins are involved. Finally, the associations
detected reflect well the organization of protein complexes into subcomplexes. By comparing
the distances between proteins in the proteasome structures with the PCA results obtained, it
is possible to confirm that increasing the linker length does allow the detection of associations
between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein associations.
In future experiments, it would be appropriate to use the standard linker in addition to linkers
of additional lengths, which would provide a validation and a point of comparison and would
help detect problems that may have arisen during construction of the fusion proteins. For
example, it becomes easier to spot a faulty recombination event or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Admittedly, adding this control makes the experiments
and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides
an additional hybrid method that remains relatively simple. It requires no particular
infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the
DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression.
The fragments do not tend to associate spontaneously unless they are very close to each other,
which reduces false positives. The DHFR PCA can be performed on either solid or liquid
medium. It is therefore easy to study PPIs under several growth conditions or in the presence
of cellular perturbations. It can also be followed in real time, which gives access to the study
of interaction dynamics (56). These features offer certain advantages over other hybrid
methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths providing several resolutions of the PPI network. It would first be
necessary to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. It would also be necessary to determine the optimal
increment to maximize new information, taking into account the additional complexity that
comes with each added linker. The availability of robotic platforms makes the creation of
DHFR F[1,2] protein collections with different linker lengths more realistic. Such additional
collections would provide a picture of the yeast protein-protein association network at
different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the
associations detected, which lowers the molecular resolution. Before investigating a protein
complex more exhaustively, its characteristics, such as size and flexibility, should be taken
into consideration. For small protein complexes, a finer resolution, and therefore shorter
linkers, may be sufficient, whereas a coarser resolution would be needed for large protein
complexes.
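The trade-off between linker length and resolution can be read as simple arithmetic on the contour length of the two linkers. The sketch below gives a crude upper bound only, under assumptions that do not come from the text: each linker unit (the L in 2xL or 4xL) is taken to be five residues, a fully extended residue is taken to span about 3.5 angstroms, and the size of the reporter fragments and of the proteins themselves is ignored.

```python
# Assumed values, for illustration only.
RESIDUES_PER_UNIT = 5        # assumption: one linker unit is five residues
ANGSTROM_PER_RESIDUE = 3.5   # assumption: span of one fully extended residue


def max_linker_reach(units_bait, units_prey):
    """Upper bound (in angstroms) on the distance bridged by two fully
    extended linkers, one on each fusion protein."""
    total_residues = (units_bait + units_prey) * RESIDUES_PER_UNIT
    return total_residues * ANGSTROM_PER_RESIDUE


# Compare the four linker combinations used in the intra-complex experiments.
for bait, prey in [(2, 2), (2, 4), (4, 2), (4, 4)]:
    print(f"{bait}xL-{prey}xL: {max_linker_reach(bait, prey):.0f}")
```

Flexible linkers are rarely fully extended in the cell, so the effective reach is expected to be shorter than this bound; the point of the sketch is only that the bound grows linearly with the number of linker units.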
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. Their
size greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved, respectively, in
the degradation and the storage of messenger RNA during cellular stress, and they are linked
to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The
scale of resolution afforded by lengthening the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm,
with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
Strain construction
Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were
amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to
be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed
proper DHFR fragment fusions for all strains.
Estimation of protein abundance
Protein quantification was done by Western blot for several strains with proteins fused to the
2xL and 4xL. These proteins were selected because we could easily assess their abundance
using antibodies raised against them. 20 OD600 of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm
(Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M N J Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same complexes as the baits or because
they interact with one of them, based on data reported in BioGRID in October 2014 (96).
A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus
was also included in the set of preys as controls. Each prey was present in four replicates,
two on each prey plate, so each interaction was measured four times. Preys were randomly
positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the
consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44
tested); and COG complex (n=8)) and because interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed into 384-format arrays and finally into 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey
plates were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates using a 1536-pin replicating tool and incubated for two days at 30°C. Two
rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were
performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and applied a manual threshold
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, keeping the particle closest to the center. We considered the
particle to be a colony if its center of mass was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
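As an illustration only (this is not the script used in the study), the transformation and the diploid-size filter can be sketched in Python. The dictionary keys `diploid_size` and `mtx_intensity` are hypothetical names, and the cutoff of 16 is taken as printed, in whatever size unit the image analysis reports.

```python
import math

def log2_intensity(raw_intensity):
    """Return log2(raw + 1); the +1 keeps zero-intensity colonies finite."""
    return math.log2(raw_intensity + 1)

def transform_colonies(colonies, min_diploid_size=16):
    """Drop colonies that are too small on the diploid plate, then transform.

    `colonies` is a list of dicts with the hypothetical keys
    'diploid_size' and 'mtx_intensity'.
    """
    kept = []
    for colony in colonies:
        if colony["diploid_size"] < min_diploid_size:
            continue  # too small on diploid selection: excluded from analysis
        kept.append({**colony, "log2_intensity": log2_intensity(colony["mtx_intensity"])})
    return kept
```

Filtering on the diploid plate before looking at the MTX signal separates failed crosses from true absence of complementation.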
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained, and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction across
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
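The scoring step can be sketched as follows. The study fitted the mixture with mixtools in R; the Python code below is a minimal stand-in written for illustration, with a hand-rolled EM for two normal components, and its initialisation and iteration count are assumptions rather than the settings used by the authors. The component with the lower mean is treated as the background.

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_normals(scores, n_iter=200):
    """Minimal EM for a two-component 1D normal mixture (illustrative only).

    Returns two (weight, mean, sd) tuples sorted by mean; the first one
    is interpreted as the background distribution.
    """
    xs = sorted(scores)
    n = len(xs)
    half = n // 2
    grand_mean = sum(xs) / n
    grand_sd = max(math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n), 1e-6)
    # Assumed initialisation: lower half seeds the background, upper half the signal.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    sd = [grand_sd, grand_sd]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each score.
        r0 = []
        for x in xs:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            r0.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        for k, resp in enumerate((r0, [1.0 - r for r in r0])):
            total = max(sum(resp), 1e-12)
            w[k] = total / n
            mu[k] = sum(r * x for r, x in zip(resp, xs)) / total
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / total
            sd[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(w, mu, sd), key=lambda comp: comp[1])

def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - b) / sdb
```

With `b` and `sdb` in hand, an interaction is called detected when `z_score(Is, b, sdb) > 2.5`, and two linker combinations are considered different when their z-scores differ by more than 2.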
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
falling below Q1-3(Q3-Q1) or above Q3+3(Q3-Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction
and positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
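The replicate filtering described above can be sketched in Python with the standard library. This is an illustration under stated assumptions: the quartile convention is not given in the text, so the sketch uses the "inclusive" method of `statistics.quantiles`, and the function names are hypothetical.

```python
import statistics

def extreme_outlier_fences(values, k=3.0):
    """Return (Q1 - k*IQR, Q3 + k*IQR); k=3 gives the 'extreme outlier' fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying inside the fences."""
    low, high = extreme_outlier_fences(values, k)
    return [v for v in values if low <= v <= high]

def interaction_score(replicates, min_replicates=4):
    """Median colony size, or None when too few replicates survive filtering."""
    if len(replicates) < min_replicates:
        return None
    return statistics.median(replicates)
```

For example, `interaction_score(drop_extreme_outliers(sizes))` gives the Is of one bait-prey pair for one linker combination, or `None` if fewer than four replicates remain.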
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C).
For the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference from the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change
(t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
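The classification rules above can be written compactly. The sketch below is illustrative: the text says only that p-values were FDR-corrected, so the Benjamini-Hochberg procedure used here is an assumption, and the paired t-test itself is taken as already computed. Pairs with a significant p-value but an absolute difference of 1 or less are not covered by the stated rules and are labelled unchanged in this sketch.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_longer, diff, fdr_p,
                  min_diff=1.0, alpha=0.05):
    """Assign a protein pair to one of the categories described in the text.

    detected_ref: detected with the 2xL-2xL reference (Zs > 2.5)
    detected_longer: detected with at least one longer linker combination
    diff: Is(longer combination) - Is(2xL-2xL)
    fdr_p: FDR-corrected p-value of the paired t-test
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if abs(diff) > min_diff and fdr_p < alpha:
        return "quantitative change"
    return "unchanged"
```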
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface
residues were used as nodes to build the graph. Edges were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered surface residues (see Fig S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
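The surface-path distance can be illustrated with a small, dependency-free Python sketch. The study used the Dijkstra implementation in NetworkX; here the same algorithm is written with `heapq` from the standard library as a stand-in, and the coordinates in the usage example are invented. Nodes stand for surface Cα positions, edges connect nodes closer than the cutoff, and edge weights are Euclidean distances.

```python
import heapq
import math

def build_surface_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (same unit as coords)."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra shortest-path length; math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u == target:
            return dist_u
        if dist_u > best.get(u, math.inf):
            continue  # stale heap entry
        for v, weight in graph[u]:
            candidate = dist_u + weight
            if candidate < best.get(v, math.inf):
                best[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return math.inf

# Invented coordinates: nodes 0 and 3 play the role of two C-termini.
coords = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (20, 10, 0)]
graph = build_surface_graph(coords, cutoff=15)
path_length = shortest_path_length(graph, 0, 3)
```

In this toy case the path 0-1-3 is longer than the straight line between nodes 0 and 3, which is the point of the approach: the path is constrained to hop between neighbouring surface nodes instead of crossing the structure.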
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length
affects protein stability, the effect is minor and not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA
with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that
are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the 10,192
unique tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3]
counted as a single PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However,
the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to
detect new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This may thus arise from steric effects rather
than from destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A).
We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are more readily detected with longer linkers
(Fig S1D). Finally, we find that the distance between proteins is significantly longer for cases
where a longer linker increases signal or leads to the detection of new interactions (Fig 2C).
This demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini, for interactions that are not
affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
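The identity values in this table compare the sequence present in each structure with the experimental yeast sequence. How the alignments were produced and which denominator was used are not stated here, so the following Python sketch only shows one common convention, assumed for illustration: identical residues divided by the number of aligned, ungapped positions. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap.

    Both inputs must already be aligned, hence of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are ignored under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, not taken from the structures listed here.
identity = percent_identity("MKT-AYL", "MKSQAYL")
```

Other conventions, such as dividing by the full length of the shorter sequence, give different values for the same alignment, so the denominator should always be reported with the result.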
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III
and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1 Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a
z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to
detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complexes
experiment. Examples for each category of change are shown. Cell growth in spot-dilution
assay (right) correlates with colony size in standard PCA (left).
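The thresholding described in panel (B) can be sketched as follows. This is a minimal Python illustration of calling positive PPIs from colony sizes with a z-score cutoff of 2.5, not the pipeline used in this study. The choice of reference distribution (here a list named background), the use of the sample standard deviation, and all names and values are assumptions.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff quoted in the caption


def z_scores(colony_sizes, background):
    """Standardize colony sizes against an assumed background distribution."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation (assumed choice)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background, threshold=Z_THRESHOLD):
    """Flag each PPI as positive when its z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes in arbitrary units, for illustration only.
background = [100, 110, 90, 105, 95]
calls = call_positive([98, 140], background)
```

With these made-up numbers, the first colony sits within the background and is called negative, while the second lies about five standard deviations above the mean and is called positive.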
Figure S2 Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
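The distance-weighted shortest path of panels (C) and (D) can be illustrated with a generic graph search. The sketch below is an assumption about how such a path might be computed, not the procedure used in this study: surface residues are treated as points in space, two points are connected when they lie within a cutoff distance, each edge is weighted by its Euclidean length, and Dijkstra's algorithm returns the length of the shortest route between two C-terminal points. The cutoff value, function name and coordinates are hypothetical.

```python
import heapq
import math


def surface_path_length(points, start, goal, cutoff):
    """Length of the shortest path from points[start] to points[goal].

    Edges join points separated by at most `cutoff` and are weighted by
    Euclidean distance. Returns math.inf when the goal cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        if u == goal:
            return d
        for v in range(len(points)):
            if v == u:
                continue
            w = math.dist(points[u], points[v])
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf


# Hypothetical surface points (arbitrary units); index 0 and 3 stand in
# for the two C-terminal residues.
points = [(0, 0, 0), (3, 4, 0), (6, 8, 0), (6, 8, 5)]
path = surface_path_length(points, 0, 3, cutoff=6.0)
```

Here the straight-line distance between the two end points is about 11.2 units, but the path constrained to neighboring surface points measures 15 units. A surface-constrained distance of this kind is never shorter than the direct distance, which is why it may be a more realistic estimate of the span a linker must cover around a large complex.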
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term
"hybrid method" designates a method that detects associations between proteins that are
close together in space without necessarily interacting physically. Such a method would make
it possible to probe and dissect the architecture of protein complexes in greater depth.
Concretely, the approach consisted of modifying the length of the linkers used in the DHFR
PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether
increasing linker length changed the set of interactions detected. It was also relevant to
test the application of the method to the study of protein complexes using several
combinations of linkers of different lengths. Finally, validation of the method could be
completed by comparing the results obtained with the distances measured from the available
protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
retaining the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been
studied. This percentage could vary with the number of linker-length combinations used. For
example, it could be lower if only one combination were used, since some protein-protein
associations were detectable only with a specific combination of linkers. Using an extended
linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs
and those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nonetheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves central proteins. Finally,
the associations detected closely reflect the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does allow the detection of associations
between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the linkers
of additional lengths, which would provide validation and a point of comparison and would
reveal problems that may have arisen during construction of the fusion proteins. For
example, it becomes easier to spot incorrect recombination or the appearance of mutations:
interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control does, however, make the experiments and
analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an
additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet can also be applied at large scale using a robotic platform. Moreover,
the DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are very close
together, which reduces false positives. The DHFR PCA can be performed on either solid or
liquid medium. It is therefore easy to study PPIs under multiple growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives access
to the study of interaction dynamics (56). These features provide certain advantages over
other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths giving several resolutions of the PPI network. It would first be
necessary to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. It would also be necessary to determine the
optimal increment to maximize new information, taking into account the complexity added
with each additional linker. The availability of robotic platforms makes it more realistic
to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such
additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the associations detected, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be
needed for large protein complexes.
The method developed during this master's project becomes particularly useful for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and storage of messenger RNA during cellular stress, and
they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a
general picture of their architecture. For the proteome of an organism, this method would
provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed, and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.
Protein-fragment complementation assays
For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belonged to the same complexes as the baits or
interacted with one of them, based on data reported in BioGRID in October 2014 (96).
A random set of 97 strains corresponding to proteins found in the cytoplasm or the
nucleus was also included in the set of preys as controls. Each prey was present in
four replicates, two on each prey plate, so each interaction was measured four times.
Preys were randomly positioned to avoid location biases.
For the intra-complexes experiment, we reviewed the literature and considered the
consensus protein complexes published by (84) to choose 95 central and associated
protein members of the following complexes: RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome
(n=47, 44 tested); and COG complex (n=8)) and because interactions among protein
members of these complexes have been shown to be detectable, at least partially, by
DHFR PCA. In addition, there are published structures available for the RNApol and
proteasome complexes, making it possible to compare our results with known protein
complex organization. We successfully constructed 80.0% and 76.6% of the strains in
MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100%
for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2]
and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of
the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating
type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of
MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations
of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL
and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for
the RNApol and COG complexes and in 16 replicates for the proteasome complex. The
blocks were randomly positioned on the colony arrays. Finally, each 1536-array was
designed to contain a double border of a strain showing a weak interaction
(Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the
colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin
(or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing
robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates
were assembled by arraying strains onto specific positions in a 96-format with a
re-arraying tool. Colonies were further condensed in 384-format arrays and finally in
1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two
different prey plates of 1536-format were generated and replicated a few times to have
enough cells to perform crosses with all of the individual baits. Second, each
1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating
tool and incubated for two days at 30 °C. Two rounds of diploid selection were
performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round.
Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four
days, after which a second round of MTX selection was performed. Plates were incubated
at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon)
each day from the second round of diploid selection to the end of the experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting
from a new cross between bait and prey strains. Correlation between the results of the
two experiments can be seen in Fig S1E. For the intra-complexes experiment, we
confirmed results for 10 pairs of interacting proteins by measuring cell growth in a
spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing
2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/ml of 1
in water. Five-fold serial dilutions were performed and 6 µL of each dilution were
spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C
and subsequently imaged with an EOS Rebel T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels
using the integrated intensity function as implemented in a custom script in ImageJ64
1.44o. We applied an image correction in which the intensity of each pixel was
extracted and the pixel intensity matrix was smoothed using a two-way median polish
and averaged with the raw image. We then converted the images to binary files, and a
manual threshold was applied across plates. We selected colonies for measurement with
a circular selection using particle detection with the built-in function "Analyze
particle" in ImageJ64. We excluded particles touching the edge of the selection and
those that had an area below 20 pixels and circularity below 0.5, using the particle
that is closest to the center. We considered the particle as being a colony if the
mass center was within the mid-distance between two colonies. All plate images were
also examined. The average of the background pixels was subtracted from the colony
intensity.
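
The particle exclusion criteria can be illustrated with a short Python sketch. This is not the custom ImageJ script used in this work: the `Particle` record and the `keep_particle` helper are hypothetical names, and the sketch assumes that a particle is rejected as soon as either the area or the circularity criterion fails. Circularity is computed as 4π × area / perimeter², the definition used by ImageJ.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    """Hypothetical record for one detected particle (units: pixels)."""
    area: float
    perimeter: float
    touches_edge: bool


def circularity(p: Particle) -> float:
    """4*pi*area / perimeter**2; equals 1.0 for a perfect circle."""
    return 4.0 * math.pi * p.area / p.perimeter ** 2


def keep_particle(p: Particle, min_area: float = 20.0,
                  min_circularity: float = 0.5) -> bool:
    """Assumed rule: drop edge-touching, too small or too irregular particles."""
    if p.touches_edge:
        return False
    return p.area >= min_area and circularity(p) >= min_circularity


# A round colony of radius 10 pixels is kept; a thin streak is not.
round_colony = Particle(area=math.pi * 10 ** 2, perimeter=2 * math.pi * 10,
                        touches_edge=False)
streak = Particle(area=100.0, perimeter=100.0, touches_edge=False)
```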
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
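
As a minimal sketch of this transformation, assuming each colony is represented by a pair (intensity on MTX, size on the diploid selection plate), the step could be written as follows. The function names and the data layout are illustrative only, and the diploid size cutoff is exposed as a parameter set to the value given in the text.

```python
import math


def log2_intensity(value: float) -> float:
    """log2(value + 1); adding 1 keeps a null intensity finite (it maps to 0)."""
    return math.log2(value + 1.0)


def transform_colonies(colonies, min_diploid_size=16.0):
    """colonies: iterable of (mtx_intensity, diploid_size) pairs (assumed layout).

    Colonies whose diploid size is below the cutoff are dropped; the others are
    returned as log2-transformed MTX intensities.
    """
    return [log2_intensity(mtx) for mtx, diploid in colonies
            if diploid >= min_diploid_size]


# Toy values, not data from the study.
example = transform_colonies([(0, 20), (1023, 30), (500, 10)])
```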
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction
score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the
distribution of interaction scores was modeled as a mixture of two normal distributions
using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b)
and standard deviation (sdb) of the background distribution were used to convert each
interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater
than 2.5 were considered significantly detected interactions. These Zs were used to
compare the same interaction with different linker size combinations. We considered
changes significant when Zs differed by more than 2.
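
The conversion of interaction scores into z-scores can be sketched in Python. The analysis itself relied on the R package mixtools; the code below is only a stand-in that fits a two-component normal mixture with a plain expectation-maximization loop. It assumes that the component with the lower mean is the background, starts the two means at the median and at the maximum of the scores, and takes 2.5 as the detection threshold. All function names are hypothetical and the data are synthetic.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM; returns (weights, means, sds)."""
    xs = sorted(scores)
    n = len(xs)
    mean_all = sum(xs) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in xs) / n) or 1.0
    means = [xs[n // 2], xs[-1]]      # assumed start: median and maximum
    sds = [sd_all, sd_all]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each score belongs to component 0
        resp0 = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0 else 0.5)
        # M-step: update weight, mean and standard deviation of each component
        for k in (0, 1):
            r = resp0 if k == 0 else [1.0 - v for v in resp0]
            rk = sum(r)
            if rk < 1e-12:
                continue
            weights[k] = rk / n
            means[k] = sum(ri * x for ri, x in zip(r, xs)) / rk
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, xs)) / rk
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds


def interaction_z_scores(scores):
    """Return (b, sdb, z-scores) with Zs = (Is - b) / sdb."""
    _, means, sds = fit_two_normals(scores)
    bg = 0 if means[0] <= means[1] else 1   # assumption: background has the lower mean
    b, sdb = means[bg], sds[bg]
    return b, sdb, [(s - b) / sdb for s in scores]


# Synthetic scores (not data from the study): 300 background-like, 30 signal-like.
background = [NormalDist(5, 1).inv_cdf((i + 0.5) / 300) for i in range(300)]
signal = [NormalDist(12, 1).inv_cdf((i + 0.5) / 30) for i in range(30)]
b, sdb, zs = interaction_z_scores(background + signal)
detected = [z > 2.5 for z in zs]
```

Fitting the background from the data, rather than from a fixed set of negative controls, lets the threshold adapt to each linker combination, which is why the parameters are estimated separately for each of them.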
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e.
values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3
represent the first and third quartiles). Colonies corresponding to the control
interaction and positioned on the array edges were removed from downstream analyses, as
well as strains for which sequencing results revealed mutations in the DHFR fusion
proteins. After these final filtering steps, interactions with at least four replicates
for every linker combination were retained and the median of colony size was used as
the Is. Significant interactions were identified as described above (Fig S1B). For the
RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the
background distribution were calculated for each linker combination and each complex
separately. For the COG complex, because the number of pairwise interactions is limited
to 64, all the results were combined to calculate these parameters. An interaction was
considered detected when the Zs was larger than 2.5. From the 236 protein pairs
presenting detected interactions with at least one linker combination, some pairs were
filtered out, mainly because they did not pass all of the thresholds or because the
fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all
tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting
proteins.
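
The outlier exclusion and the computation of the interaction score can be illustrated as follows. This is a sketch under stated assumptions: the filter is applied here to the replicate colony sizes of a single interaction, quartiles are computed with the "inclusive" method of the Python standard library (the quartile definition used in the original analysis is not specified), and `interaction_score` is a hypothetical helper name.

```python
from statistics import median, quantiles


def drop_extreme_outliers(values):
    """Keep values within [Q1 - 3*IQR, Q3 + 3*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 3 * iqr, q3 + 3 * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(values, min_replicates=4):
    """Median colony size after outlier removal, or None if too few replicates remain."""
    kept = drop_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return median(kept)


# Toy replicate sizes (not data from the study); 100 is an extreme outlier.
score = interaction_score([10, 11, 12, 13, 14, 100])
```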
At this step, pairs of interacting proteins presenting a new interaction (i.e. the
interaction was not detected with the reference linker size (2xL-2xL) but was detected
with a longer linker combination) were separated from the others and classified as new
interactions (Table S1C). For the remaining pairs, because baits and preys were
positioned so that, in a block of four adjacent strains, all combinations of linker
lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL), Is for the different linker size combinations could be compared directly.
The difference from the reference 2xL-2xL interaction was calculated for each linker
combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify
significant differences in colony size (with FDR-corrected p-values). These pairs of
interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker
size (2xL-2xL) and also with the longer linker combinations but without any significant
change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the
interaction was detected with the reference linker size (2xL-2xL) and presented
significant changes for at least one longer linker combination (difference greater than
1 or smaller than −1 with t-test FDR p-value < 0.05) (Table S1C).
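
The classification into new, unchanged and quantitatively changed interactions can be summarized with a small sketch. Two points are assumptions rather than statements from the text: the FDR correction is taken to be the Benjamini-Hochberg procedure, and a pair with a significant p-value but an absolute difference of 1 or less is counted as unchanged. The paired t-test itself is not reimplemented; p-values are taken as inputs, and the function names are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_long, difference, fdr_p,
             alpha=0.05, min_diff=1.0):
    """Classify one protein pair for one longer linker combination.

    detected_ref: detected with 2xL-2xL; detected_long: detected with the longer
    combination; difference: Is(longer) - Is(2xL-2xL); fdr_p: corrected p-value.
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    if fdr_p < alpha and abs(difference) > min_diff:
        return "quantitative change"
    return "unchanged"


# Toy p-values, not data from the study.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```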
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB protein data bank (http://www.rcsb.org) using the usearch
software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric
complexes for the RNApol I, II and III, respectively, as they included the largest
number of proteins from the experimental set with the highest sequence identities.
Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex.
Table S2B presents the identity between each RNApol structure and the experimental
sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the
base and the lid (Fig S2A, top). There was no complete structure of the yeast
proteasome complex in the RCSB protein data bank at the time of the analyses. Sequence
alignment of the experimental protein sequences of the individual sections of the
proteasome complex with the sequences of the RCSB protein data bank identified PDB IDs
5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core.
Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by
superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP,
using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B
structures showed an incorrect overlap in the central core (Fig S2B). This overlap is
well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the
base, the lid and the outer rings of the core. The inner rings of the core were from
structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome
structure. Table S2C presents the identity between the built structure and the
experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in
the C-terminal section. In these cases, the last available residue was used instead to
calculate the distance (a list is provided in Table S2D). The distances were calculated
from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX
(an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The
Cα atoms of surface residues were used as nodes to build the graph. The edges of the
graph were placed between each pair of nodes using a distance cutoff of 15 Å for the
RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the
distance between the two nodes. Surface residues were identified as follows. First, the
structure of the protein complex was represented using the "show dots" and "set
dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II
complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic
file format. From this file, the coordinates of each dot were extracted. Residues within
15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure
were considered surface residues (see Fig S2D for a representation of the method for
the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
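
The shortest-path calculation can be illustrated with a self-contained sketch. The analysis used NetworkX; the version below reimplements Dijkstra's algorithm with the standard library only, so that it runs without dependencies. Node names, coordinates and the cutoff value are toy values, and `build_graph` and `shortest_path_length` are hypothetical helper names.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """coords: {node: (x, y, z)} for surface C-alpha atoms (assumed input).

    An edge is placed between two nodes when they are within `cutoff` (in Å);
    its weight is the Euclidean distance between them.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf


# Toy coordinates: A and C are 20 Å apart, too far for a direct edge at a
# 15 Å cutoff, so the path has to go through B.
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (10, 10, 0)}
graph = build_graph(coords, cutoff=15.0)
path_length = shortest_path_length(graph, "A", "C")
```

Restricting the nodes to surface residues forces the path to follow the surface of the complex instead of cutting through its interior, which is closer to the route a flexible linker would have to take.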
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them
as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed
whether longer linkers destabilize proteins and therefore interfere with the detection
of PPIs. No evidence of protein degradation was found for any of the six proteins
examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that
if linker length affects protein stability, it has a minor effect that is not
generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55),
we constructed reporter strains for 15 proteins that are part of seven complexes, with
the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density
yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused
to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the
baits, proteins that are within the same complexes as the baits, and random proteins
used as controls, for a total of 26,640 potential interactions in four replicates
(Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL,
3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase
in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven
PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two
increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay
(Fig 1A). Decreased interactions may represent steric effects that reduce signal due to
the fusion of the DHFR fragments. Four out of nine increased interactions were reported
by affinity-capture mass spectrometry (18) but not by PCA with standard linkers,
suggesting that longer linkers may allow for the detection of PPIs that are not
necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that
there is no decrease in specificity with the elongated linkers. Finally, for the cases
where proteins were not in the same complex or were not previously shown to interact,
it is likely that they represent actual interactions previously undetected in living
cells. For example, many genetic interactions and physical interactions (in vitro and
in vivo) have been described between the actin cytoskeleton and the proteasome (97,
98). Here we detect some interactions in living cells (such as between Arc18 and Pup1),
often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of
these results thus show that the DHFR PCA with increased linker size reveals new
interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes) which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for
PPIs between the RNApol I, II and III and the COG complex were also performed. Among
the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig S1B and
Table S1C), representing PPIs among 228 protein pairs (197 unique; reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as
only one PPI) after filtration.
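
The way reciprocal interactions are collapsed into unique pairs can be shown with a short sketch, assuming each detected interaction is recorded as a (bait, prey) pair. The function name and the placeholder protein names X, Y and Z are illustrative.

```python
def unique_pairs(detected):
    """Collapse reciprocal (bait, prey) pairs: (X, Y) and (Y, X) count once."""
    return {frozenset(pair) for pair in detected}


# X-F[1,2]/Y-F[3] and Y-F[1,2]/X-F[3] are the same unique pair.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_pairs = len(detected)               # 3 detected protein pairs
n_unique = len(unique_pairs(detected))  # 2 unique pairs
```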
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for
almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in
the interaction strength was observed when using the 4xL compared to the 2xL,
reinforcing the observation that no overall decrease in specificity is seen with the
elongated linkers. However, the increased linker length had an obvious impact for 93
(83 unique) interacting pairs (Fig 1B). PCA signal was indeed quantitatively changed
for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using
at least one 4xL. Thus, doubling the linker length can substantially widen the
repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for
the detection of new interactions or to increase the PCA signal of a previously
detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved
with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite
effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This
protein, one of the two largest components of the RNApol II, contributes to five out of
the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with
its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This
consequence may thus arise from steric effects rather than through the destabilization
of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across
complexes. However, a larger proportion (about 30-40%) of new interactions was detected
for the RNApol complexes compared to the proteasome and the COG complex (Fig 1C).
Within the RNApol complexes, more than half of the new interactions were found between
proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific
to each of the individual polymerases (Fig 1D, left panel). In the proteasome, five new
interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from
the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen
between Cog1 from the core subunit and proteins from the lobe a or lobe b (Fig 1D,
right panel). All these results show that doubling the linker length of central
proteins in complexes expands the network of interactions detected by DHFR PCA and
helps to better describe the organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed
better discrimination between the different subunits of large complexes. This is
particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More
PPIs are detected when the two proteins are in the same subcomplex (such as base-base,
core-core and lid-lid), regardless of the linker length, though the fraction is
systematically higher with longer linkers. The same trend is observed for the RNApol
and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living
cells could thus gain from PPI data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we
tested whether the PCA signal with longer linkers reflects, at least partly, the
proximity of proteins within complexes, as suggested by the analysis of subcomplexes.
As a proxy for distance, we measured the shortest path between the C-termini of the
proteins of interest (Table S2A). We find that interaction z-scores often reflect the
distance between proteins (Fig 2A). For the proteasome, the complex for which we have
the most distance values, a negative correlation is observed between the pairwise
distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left
panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances
with longer linkers is clearly visible from the cumulative distribution of z-scores as
a function of pairwise distances, where positive z-scores accumulate to a longer
distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right
panel). The density distribution of distances within complexes is also slightly shifted
towards larger distances for longer linkers, showing that longer distances are more
readily detected with longer linkers (Fig S1D). Finally, we find that the distance
between proteins is significantly longer for cases where a longer linker increases
signal or leads to the detection of new interactions (Fig 2C). This demonstrates once
again that a longer linker enhances the ability to detect interactions, especially for
proteins that are more distant in space.
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact
directly and indirectly in vivo (88). Progress requires that we adapt or develop tools
to detect and measure protein proximity in living cells and among endogenously
expressed proteins. Here we show that DHFR PCA with a modest increase in linker size,
from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with
an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs,
including interactions among complexes and subcomplexes within large complexes. Because
a single longer linker is generally sufficient to detect new interactions, the current
strains from the DHFR PCA collection could be used as preys while requiring only the
construction of baits with different linker sizes. PCA is therefore an addition to the
other methods available to obtain low-resolution structural information among subunits
of complexes, which include chemical cross-linking of protein complexes (100),
FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian
cells (68). Despite major advances in these other technologies in recent years, PCA
will remain the simplest assay because it requires minimal infrastructure investment
and can be adapted for high-throughput screening, which is still difficult to achieve
with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) or a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted
with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
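Panel (C) treats the two orientations of a bait-prey pair as a single PPI. The sketch below illustrates one way to do this bookkeeping. It is not the procedure used in the study: the rule for reconciling orientations that fall in different categories (here, "new" over "changed" over "unchanged") is an assumption, and the records are hypothetical.

```python
from collections import Counter

# Assumed precedence when the two orientations disagree (hypothetical rule).
PRIORITY = {"unchanged": 0, "changed": 1, "new": 2}

def collapse_reciprocal(records):
    """Merge (bait, prey) and (prey, bait) records into one unordered PPI.

    records: iterable of (bait, prey, category) tuples.
    Returns a dict mapping frozenset({bait, prey}) to a single category.
    """
    merged = {}
    for bait, prey, category in records:
        key = frozenset((bait, prey))
        current = merged.get(key)
        if current is None or PRIORITY[category] > PRIORITY[current]:
            merged[key] = category
    return merged

def category_proportions(merged):
    """Return the fraction of unique PPIs in each category."""
    counts = Counter(merged.values())
    total = sum(counts.values())
    return {category: counts[category] / total for category in PRIORITY}

# Hypothetical records, for illustration only.
records = [
    ("Rpn5", "Rpn6", "unchanged"),
    ("Rpn6", "Rpn5", "changed"),   # reciprocal of the record above
    ("Pre1", "Pre2", "new"),
    ("Scl1", "Pre8", "unchanged"),
]
unique_ppis = collapse_reciprocal(records)
print(len(unique_ppis), category_proportions(unique_ppis))
```

Using an unordered key means that a pair tested in both orientations contributes once to the proportions, whichever orientation produced the signal.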
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
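Panel (B) reports Spearman correlations between PPI z-scores and C-terminus distances, with the data binned into ten distance classes. A minimal sketch of both operations follows. The equal-width binning, the function names and the example values are assumptions made for illustration; the caption does not specify how the classes were defined.

```python
import math

def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def bin_means(distances, scores, n_bins=10):
    """Mean score per equal-width distance class (None for empty classes)."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for d, s in zip(distances, scores):
        idx = min(int((d - lo) / width), n_bins - 1)  # keep the maximum in the last class
        sums[idx] += s
        counts[idx] += 1
    return [sums[i] / counts[i] if counts[i] else None for i in range(n_bins)]

# Hypothetical distances (angstroms) and z-scores, for illustration only.
distances = [20.0, 45.0, 60.0, 85.0, 110.0, 150.0]
z_scores = [9.1, 7.4, 6.0, 6.5, 3.2, 2.8]
print(spearman(distances, z_scores))
print(bin_means(distances, z_scores, n_bins=3))
```

A negative coefficient indicates that the PCA signal tends to decrease as the distance between C-termini increases.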
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
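Panel (B) relates colony size to a z-score threshold. The sketch below shows one way such a call could be made, assuming that the z-score is the deviation of a colony size from the mean of a background set, in units of the background standard deviation. The choice of background set, the default threshold value and all names and numbers are illustrative assumptions, not the procedure used in the study.

```python
import statistics

def z_scores(colony_sizes, background_sizes):
    """Deviation of each colony size from the background, in standard deviations.

    colony_sizes: dict mapping a PPI label to its colony size.
    background_sizes: sizes of colonies assumed to reflect background noise.
    """
    mu = statistics.mean(background_sizes)
    sigma = statistics.stdev(background_sizes)
    return {ppi: (size - mu) / sigma for ppi, size in colony_sizes.items()}

def positive_ppis(colony_sizes, background_sizes, threshold=2.5):
    """Return the PPIs whose z-score exceeds the threshold (assumed default)."""
    scores = z_scores(colony_sizes, background_sizes)
    return sorted(ppi for ppi, z in scores.items() if z > threshold)

# Hypothetical colony sizes (pixels), for illustration only.
background = [100, 120, 80, 100]
observed = {"Rpn5-Rpn6": 200, "Pre1-Scl1": 110}
print(z_scores(observed, background))
print(positive_ppis(observed, background))
```

Expressing the signal relative to the background spread makes scores comparable across plates whose overall growth differs.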
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
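A distance-weighted shortest path over surface residues can be computed with Dijkstra's algorithm. The following is a minimal sketch, assuming that surface residues are represented by 3D coordinates and that two residues are connected when they lie within a cutoff distance. The cutoff, the function name and the coordinates are hypothetical; the caption does not describe how the graph was built.

```python
import heapq
import math

def surface_path_length(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    points: list of (x, y, z) coordinates of surface residues.
    cutoff: maximum distance at which two residues are linked (assumed parameter).
    Edge weights are Euclidean distances; returns math.inf if no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, u = heapq.heappop(heap)
        if u == end:
            return dist
        if dist > best.get(u, math.inf):
            continue  # stale queue entry
        for v, coords in enumerate(points):
            if v == u:
                continue
            step = math.dist(points[u], coords)
            if step <= cutoff and dist + step < best.get(v, math.inf):
                best[v] = dist + step
                heapq.heappush(heap, (best[v], v))
    return math.inf

# Hypothetical coordinates (angstroms): the corners of a 3 x 4 rectangle.
points = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 4, 0)]
print(surface_path_length(points, 0, 2, cutoff=4.5))  # path along the edges
print(surface_path_length(points, 0, 2, cutoff=5.0))  # direct diagonal allowed
```

Restricting the graph to surface residues makes the path run around the complex rather than through it, which is closer to the route a flexible linker could follow.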
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close together in space,
without these associations necessarily being physical interactions. Such a method would
allow the architecture of protein complexes to be examined and dissected in greater depth.
In practice, the approach consisted of modifying the length of the linkers used in DHFR PCA
in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
linker length changed the interactions detected. It was also relevant to test the method on
protein complexes using several combinations of linkers of different lengths. Finally, the
validity of the method could be further confirmed by comparing the results with distances
measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, more than 30% of all unique interactions detected
were new. A comparable proportion of new interactions could therefore be expected if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths that
are used. For example, it could be lower if only one combination were used, because some
protein-protein associations were detectable only with a specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect
most of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the associations detected reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does allow the detection of associations
between proteins that are farther apart in space.
The modification made to DHFR PCA represents a notable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the linkers
of other lengths. This would provide a validation and a point of comparison, and would help
detect problems that may have occurred during construction of the fusion proteins, such as
incorrect recombination or the appearance of mutations. Interactions would be observed for
the correctly constructed protein, whereas the problematic one would show none. Adding this
control does, however, make the experiments and analyses more complex. Despite this
drawback, this variation of DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no particular infrastructure, yet it can also be applied at
large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that keeps
the endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close together, which reduces false positives. DHFR PCA
can be performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the dynamics of interactions (56). These
features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment should also be
determined, so as to maximize new information while taking into account the complexity
added by each additional linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the associations detected, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution
would be appropriate for large protein complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by extending the linker would give a
general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
48
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed in 384-format arrays and finally in 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Second, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30°C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.
For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which
differences in signal were increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. Correlation between the results of the two
experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted, and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and
circularity below 0.5, using the particle that is closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.
Colony intensity values from day 4 of growth of the second MTX selection were log2
transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
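The transformation and size filter described above can be sketched in a few lines of Python. This is a minimal illustration and not the script used for the analyses: the function names and the example intensities are hypothetical, and the threshold of 16 is taken as written in the text.

```python
import math

def log2_plus_one(intensities):
    """Return log2(x + 1) for each raw colony intensity.

    Adding 1 before the transform keeps a raw value of 0 finite (it maps to 0).
    """
    return [math.log2(x + 1) for x in intensities]

def keep_colony(diploid_size, min_diploid_size=16):
    """Keep a colony only if it grew to at least the minimal size on the diploid plate."""
    return diploid_size >= min_diploid_size

raw = [0, 1, 3, 1023]  # hypothetical raw colony intensities
print(log2_plus_one(raw))
print([keep_colony(size) for size in (5, 16, 200)])
```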
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
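The conversion of interaction scores into z-scores can be illustrated with a short Python sketch. In the analyses, b and sdb come from the mixture fit obtained with mixtools in R; here they are simply passed in as hypothetical numbers, and the function names are assumptions made for illustration.

```python
def z_score(interaction_score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb estimated from the background component."""
    return (interaction_score - background_mean) / background_sd

def is_detected(zs, threshold=2.5):
    """An interaction is called detected when its z-score exceeds the threshold."""
    return zs > threshold

def differs_between_linkers(zs_reference, zs_longer, min_difference=2.0):
    """Flag a change when the two z-scores differ by more than min_difference."""
    return abs(zs_longer - zs_reference) > min_difference

b, sdb = 8.0, 0.5                 # hypothetical background mean and standard deviation
zs_2xl = z_score(8.5, b, sdb)     # 1.0, below the detection threshold
zs_4xl = z_score(10.0, b, sdb)    # 4.0, above the detection threshold
print(is_detected(zs_2xl), is_detected(zs_4xl), differs_between_linkers(zs_2xl, zs_4xl))
```

With these made-up values the interaction would be called with the longer linker only, which is the situation classified as a new interaction in the intra-complexes analysis.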
For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
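The exclusion of extreme outliers with fences at three interquartile ranges can be sketched as follows. This is an illustration under stated assumptions: the quartile definition used in the original analysis is not specified, so the default method of Python's statistics.quantiles is used here, and the function names and replicate values are hypothetical.

```python
import statistics

def extreme_outlier_fences(values, k=3.0):
    """Return (Q1 - k*IQR, Q3 + k*IQR) for a list of colony sizes."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the two fences."""
    low, high = extreme_outlier_fences(values, k)
    return [v for v in values if low <= v <= high]

replicates = [10, 11, 12, 13, 14, 15, 16, 40]  # hypothetical replicate colony sizes
kept = drop_extreme_outliers(replicates)
print(kept, statistics.median(kept))           # the median of the kept values is the Is
```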
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
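The classification rules above can be summarized in a small Python sketch. Two points are assumptions rather than facts from the text: the FDR correction is implemented here as the Benjamini-Hochberg procedure (the text only says "FDR corrected"), and pairs with a significant p-value but an absolute difference of at most 1 are counted as unchanged. The paired t-test itself is not reproduced; its p-values are taken as inputs, and all names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_reference, detected_longer, differences, fdr_ps,
                  min_abs_difference=1.0, alpha=0.05):
    """Classify one protein pair from its 2xL-2xL call and its longer-linker results.

    differences and fdr_ps hold one value per longer linker combination.
    """
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_abs_difference and p < alpha
                  for d, p in zip(differences, fdr_ps))
    return "quantitative change" if changed else "unchanged"

print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
print(classify_pair(True, True, [0.2, 1.4, 1.8], [0.60, 0.03, 0.01]))
```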
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle, the base and
the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
PDB 5A5B structures on the structure of 5CZ4, one on each side of the core particle (CP),
using the super command in the PyMOL software. Visual inspection of the resulting superposed
5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well
resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the
lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface
residues were used as nodes to build the graph. The edges of the graph were placed between
each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the
proteasome. The weight of the edges was equal to the distance between node pairs. Surface
residues were identified as follows. First, the structure of the protein complex was represented
using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of
10 Å for the RNApol II complex and of 20 Å for the proteasome, respectively. These dots were
exported in the "wrl" graphic file format. From this file, the coordinates of each dot were
extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the
proteasome structure were considered as surface residues (see Fig S2D for a representation of
the method for the proteasome). In cases where multiple copies of the proteins were present
within the complexes, the mean of the minimal distances possible was used for the analyses.
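The graph-based distance can be illustrated with a dependency-free Python sketch. The analyses relied on the Dijkstra implementation of NetworkX; the version below re-implements the same idea with the standard library only, and the node names, coordinates and cutoff are hypothetical values chosen for the example.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = Euclidean distance."""
    names = list(coords)
    graph = {name: [] for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf

# Hypothetical surface Cα coordinates (in Å) for four residues.
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (20, 10, 0)}
graph = build_graph(coords, cutoff=15)
print(shortest_path_length(graph, "A", "D"))
```

In this toy example the straight line from A to D (about 22.4 Å) exceeds the cutoff, so the path has to travel through B, which gives a longer distance that follows the connected surface nodes.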
All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using antibodies
targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein
stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with regular 2xL). These include proteins known to interact with the baits, proteins that are
within the same complexes as the baits, and random proteins used as controls, for a total of
26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric
effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one
PPI) after filtration.
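The distinction between all detected pairs and unique pairs can be made concrete with a short Python sketch, in which a pair tested in both bait-prey orientations is counted once. The function name and the placeholder proteins X, Y and Z are hypothetical.

```python
def count_unique_pairs(bait_prey_pairs):
    """Count protein pairs regardless of which partner carries which DHFR fragment."""
    return len({frozenset(pair) for pair in bait_prey_pairs})

# X-Y and Y-X are reciprocal orientations of the same pair.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
print(len(detected), count_unique_pairs(detected))
```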
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228 or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
fact that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from the lobe a or lobe b (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are better detected with longer linker sizes (Fig S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker size increases signal or leads to the detection of new interactions (Fig 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
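The relationship between pairwise distance and z-score can be quantified with a correlation coefficient. The text does not state which coefficient was used, so the Python sketch below assumes a Pearson correlation for illustration; the function name and the four data points are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

distances = [10, 20, 30, 40]   # hypothetical pairwise distances (Å)
z_scores = [8, 6, 4, 2]        # hypothetical interaction z-scores
print(pearson_r(distances, z_scores))  # negative: signal drops as distance grows
```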
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted
with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome.
Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting each pair of reciprocal interactions, such as
X-DHFR F[1,2] with Y-DHFR F[3] and Y-DHFR F[1,2] with X-DHFR F[3], as a single PPI.
(D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to
the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines:
unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe
patterns inside colored boxes represent proteins that were absent from the experiment.
(E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes
within complexes.
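
Counting reciprocal interactions as a single PPI, as in panel (C), amounts to treating each bait-prey pair as unordered. The following is a minimal sketch of that bookkeeping, assuming each observation is a (bait, prey, z-score) tuple; the function name `collapse_reciprocal` and the values are hypothetical, and how the two orientations were combined in the actual analysis is not stated here, so both scores are simply kept.

```python
def collapse_reciprocal(ppis):
    """Group bait-prey and prey-bait observations under one unordered pair."""
    merged = {}
    for bait, prey, z in ppis:
        key = frozenset((bait, prey))  # order-independent identifier for the pair
        merged.setdefault(key, []).append(z)
    return merged

# Hypothetical observations (illustrative values, not data from the study).
observations = [
    ("Pre1", "Pre2", 5.2),  # Pre1-DHFR F[1,2] with Pre2-DHFR F[3]
    ("Pre2", "Pre1", 4.8),  # reciprocal orientation of the same pair
    ("Rpn5", "Rpn6", 3.1),
]

unique = collapse_reciprocal(observations)
print(len(unique))  # number of distinct PPIs once orientation is ignored
```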
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different pairwise protein
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini, for interactions that are not
affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences.
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
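
For readers unfamiliar with the identity values reported in this table, here is a minimal sketch of how a percent identity can be computed from two already-aligned sequences. The alignment method and the exact denominator used to produce the table are not stated in this document, so ignoring gapped columns is an assumption, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Assumption: columns where either sequence has a gap ('-') are ignored.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Toy aligned sequences (hypothetical, for illustration only).
structure_seq = "MKT-LLVAGS"
reference_seq = "MKTALLIAGS"

identity = percent_identity(structure_seq, reference_seq)
print(round(identity, 1))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which would give slightly different percentages.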
Table S2C. Identity between the proteasome structure and the experimental sequences.
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures.
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome:
top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected
results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by
DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment.
Examples for each category of change are shown. Cell growth in the spot-dilution assay (right)
correlates with colony size in the standard PCA (left).
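
The thresholding described in panel (B) can be sketched as follows. How the background distribution was estimated is not described in this caption, so the sketch simply assumes a set of background colony sizes whose mean and standard deviation define the z-score; the colony sizes, the function name `z_scores` and the constant `Z_THRESHOLD` are hypothetical illustrations.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # PPIs scoring above this value are called positive

def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background, in SD units."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation of the background
    return [(size - mu) / sigma for size in colony_sizes]

# Invented colony sizes (arbitrary units), for illustration only.
background = [98.0, 102.0, 100.0, 101.0, 99.0]
colony_sizes = [100.5, 103.0, 120.0]

scores = z_scores(colony_sizes, background)
positives = [size for size, z in zip(colony_sizes, scores) if z > Z_THRESHOLD]
print(positives)
```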
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
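
A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm over a graph of surface points. This is only an illustration under stated assumptions: the graph construction (connecting points closer than a `cutoff`), the cutoff value, the coordinates and the function name `shortest_path_length` are all hypothetical and are not taken from the method used to produce Table S2A.

```python
import heapq
from math import dist, inf

def shortest_path_length(points, start, goal, cutoff):
    """Dijkstra over labeled 3D points; edges join points within `cutoff`,
    weighted by their Euclidean distance."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > best.get(node, inf):
            continue  # stale queue entry, a shorter route was already found
        for other, coords in points.items():
            if other == node:
                continue
            step = dist(points[node], coords)
            if step <= cutoff and d + step < best.get(other, inf):
                best[other] = d + step
                heapq.heappush(heap, (d + step, other))
    return inf  # goal not reachable with this cutoff

# Hypothetical coordinates (in Å): two C-termini joined through two surface points.
points = {
    "Cter_A": (0.0, 0.0, 0.0),
    "s1": (3.0, 4.0, 0.0),
    "s2": (6.0, 8.0, 0.0),
    "Cter_B": (6.0, 8.0, 5.0),
}

length = shortest_path_length(points, "Cter_A", "Cter_B", cutoff=6.0)
print(length)
```

In this toy case the path along the intermediate points is longer than the straight line between the two termini, which illustrates why a path constrained to the surface can differ from a simple Euclidean distance.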
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
to each other in space, without these associations necessarily being physical interactions.
Such a method would make it possible to examine and dissect the architecture of protein
complexes in greater depth. Concretely, the approach was to modify the length of the
linkers used in DHFR PCA in S. cerevisiae. To validate the method, we first had to
verify whether increasing linker length changed the interactions detected. It was also
relevant to test the method on protein complexes using several combinations of linkers of
different lengths. Finally, the validity of the method could be further confirmed by
comparing the results with distances measured from the available protein structures of the
proteasome.
The results of the first validation show that adjusting a single parameter, namely
doubling the length of one linker, significantly increased the signal-to-noise ratio and
allowed associations to be identified more reliably. Seven new associations were observed
within protein complexes and between different complexes, notably between the proteasome
and the actin cytoskeleton. The nature of the associations detected suggests that the
specificity of DHFR PCA is preserved despite the change in linker length. The in-depth
study of the five protein complexes shows that this variation of DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain
nearly as many new interactions if this variation of PCA were applied to protein complexes
that have already been studied. This percentage could vary with the number of combinations
of linkers of different lengths that are used. For example, it could be lower if only one
combination were used, since some protein-protein associations were detectable only with a
specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment
appears to be sufficient to detect most of the new PPIs and most of those whose signal
increases. The rare cases in which the signal decreased as linker length increased are more
likely caused by steric effects than by a destabilization of the proteins involved. These
cases can nonetheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers provides information
on the organization of protein complexes, particularly when central proteins are involved.
Finally, the associations detected reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with
the PCA results confirms that increasing linker length does allow the detection of
associations between proteins that are more distant in space.
The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to linkers of
other lengths, which would provide a validation and a point of comparison and would help
detect problems that may have occurred during the construction of the fusion proteins. For
example, it becomes easier to spot a faulty recombination event or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control certainly makes the experiments and
analyses more complex. Despite this drawback, this variation of DHFR PCA provides an
additional hybrid method that remains relatively simple. It requires no particular
infrastructure, yet it can also be applied at large scale using a robotic platform. In
addition, DHFR PCA is an in vivo method that preserves the endogenous promoter for protein
expression. The fragments do not tend to interact spontaneously with each other unless they
are very close together, which reduces false positives. DHFR PCA can be performed on
either solid or liquid medium. It is therefore easy to study PPIs under several growth
conditions or in the presence of cellular perturbations. It can also be followed in real
time, which gives access to the study of interaction dynamics (56). These features provide
certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to
establish a range of linker lengths providing several resolutions of the PPI network. It
would first be necessary to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize the new information gained, taking into account
the additional complexity introduced by each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins
with different linker lengths. Such additional collections would provide a picture of the
yeast protein-protein association network at different resolutions, from fine to coarse.
Indeed, the longer the linker, the more distant the associations detected, which lowers
the molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into consideration. For
small protein complexes, a finer resolution, and therefore shorter linkers, might be
sufficient, whereas a coarser resolution would be appropriate for large protein complexes.
The method developed during this master's project becomes particularly interesting for
the study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or with other imaging methods. The
size of these complexes greatly limits their study and makes determining their architecture
a challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us
a general picture of their architecture. At the level of the proteome of an organism, this
method would provide a better definition of the organization of the cellular machinery.
Bibliography
1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36
22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions
were performed, and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel
T3i camera (Canon).
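As a rough illustration of the dilution series described above, the sketch below computes the nominal OD600 of each spot. The number of dilution steps (five here) and the function name are assumptions made for illustration; the text only specifies a starting OD600 of 1 and a 5-fold dilution factor.

```python
def dilution_series(start_od=1.0, fold=5, n_steps=5):
    """Return the nominal OD600 of each spot in a serial dilution.

    n_steps is a hypothetical value: the protocol does not state how many
    dilutions were spotted.
    """
    return [start_od / fold ** i for i in range(n_steps)]


series = dilution_series()
for spot, od in enumerate(series):
    print(f"spot {spot}: nominal OD600 = {od:.4f}")
```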
PCA images and statistical analyses
For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction in which the intensity of each pixel was extracted, and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files, and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection, using particle
detection with the built-in “Analyze Particles” function in ImageJ64. We excluded particles
touching the edge of the selection and those with an area below 20 pixels and a circularity
below 0.5, using the particle closest to the center. We considered a particle to be a colony
if its center of mass was within the mid-distance between two colonies. All plate images
were also examined. The average of the background pixels was subtracted from the colony
intensity.
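The image correction step can be sketched as follows. This is a minimal plain-Python version of Tukey's two-way median polish, not the custom ImageJ script used in the study. It assumes that "smoothed" refers to the fitted additive surface (overall + row effect + column effect), which is then averaged with the raw matrix; the function names and the toy matrix are illustrative only.

```python
from statistics import median


def median_polish(matrix, n_iter=10):
    """Tukey's two-way median polish.

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the row medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        # Move the median of the column effects into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the column medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Move the median of the row effects into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid


def smooth_and_average(matrix):
    """Average the median-polish fit with the raw matrix (assumed reading)."""
    overall, row_eff, col_eff, _ = median_polish(matrix)
    return [[(overall + row_eff[i] + col_eff[j] + raw) / 2
             for j, raw in enumerate(row)]
            for i, row in enumerate(matrix)]


# Toy intensity matrix with one aberrant pixel in the last cell.
raw = [[10, 12, 14], [20, 22, 24], [30, 32, 90]]
corrected = smooth_and_average(raw)
```

Because medians rather than means are swept out, a single aberrant pixel has little influence on the fitted row and column effects.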
Colony intensity values from day 4 of growth on the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size
smaller than 16 on the diploid selection plate were eliminated.
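A minimal sketch of this transformation and filter is given below. The record layout (dictionaries with `diploid_size` and `mtx_intensity` keys) is hypothetical and only serves to illustrate the two operations.

```python
import math


def log2_plus_one(values):
    """log2(x + 1): a colony with zero intensity maps to 0 instead of -inf."""
    return [math.log2(v + 1) for v in values]


def keep_grown_diploids(colonies, min_diploid_size=16):
    """Drop colonies whose size on the diploid selection plate is below the cutoff.

    `colonies` is a list of dicts with the hypothetical keys
    'diploid_size' and 'mtx_intensity'.
    """
    return [c for c in colonies if c["diploid_size"] >= min_diploid_size]


colonies = [
    {"diploid_size": 250, "mtx_intensity": 1023},
    {"diploid_size": 8, "mtx_intensity": 0},  # failed to grow as a diploid
]
kept = keep_grown_diploids(colonies)
scores = log2_plus_one([c["mtx_intensity"] for c in kept])
```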
For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained, and the median colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score
into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered
significant detected interactions. These Zs were used to compare the same interaction across
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
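The scoring scheme above can be illustrated with the following sketch. It is a small plain-Python expectation-maximization fit of a two-component normal mixture that stands in for mixtools' normalmixEM; it is not a port of that function, and its initialisation (splitting the scores at the midpoint of their range) is an assumption made for simplicity. The component with the lower mean is treated as the background, and the simulated scores are synthetic.

```python
import math
import random
from statistics import mean, pstdev


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=100):
    """Fit a two-component 1D normal mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean, so the
    first entry is taken as the background. Assumes the scores are not all equal.
    """
    data = list(scores)
    mid = (min(data) + max(data)) / 2
    groups = [[x for x in data if x <= mid], [x for x in data if x > mid]]
    params = [(len(g) / len(data), mean(g), pstdev(g) or 1.0) for g in groups]
    for _ in range(n_iter):
        # E-step: posterior probability that each score comes from component 0.
        resp = []
        for x in data:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and sd of each component.
        new_params = []
        for r in (resp, [1 - v for v in resp]):
            w = max(sum(r), 1e-12)
            mu = sum(ri * x for ri, x in zip(r, data)) / w
            var = sum(ri * (x - mu) ** 2 for ri, x in zip(r, data)) / w
            new_params.append((w / len(data), mu, max(math.sqrt(var), 1e-6)))
        params = new_params
    return sorted(params, key=lambda p: p[1])


def to_z_scores(scores):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    background, _signal = fit_two_normals(scores)
    _, b, sdb = background
    return [(s - b) / sdb for s in scores]


# Synthetic interaction scores: mostly background plus a minority of true signal.
rng = random.Random(0)
scores = [rng.gauss(0, 1) for _ in range(900)] + [rng.gauss(8, 1) for _ in range(100)]
zs = to_z_scores(scores)
n_detected = sum(z > 2.5 for z in zs)
```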
For the intra-complex experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the
first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger than 2.5.
From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique)
pairs of interacting proteins.
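The extreme-outlier rule at the start of this paragraph can be sketched as below. Quartile conventions differ between tools, and the one used in the original analysis is not stated; the `inclusive` method of Python's `statistics.quantiles` is used here as an assumption, and the colony values are made up.

```python
from statistics import quantiles


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR], where IQR = Q3 - Q1."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Illustrative replicate colony sizes with one aberrant value.
replicates = [10, 11, 12, 13, 14, 15, 100]
filtered = drop_extreme_outliers(replicates)
```

With k = 3 only values far outside the bulk of the distribution are discarded, which is more conservative than the common k = 1.5 boxplot rule.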
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction
was not detected with the reference linker size (2xL-2xL) but was detected with a longer
linker combination) were separated from the others and classified as new interactions
(Table S1C). For the remaining pairs, because baits and preys were positioned such that,
within a block of four adjacent strains, all combinations of linker lengths could be tested
for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different
linker size combinations could be compared directly. The difference from the reference
2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and
4xL-4xL). A paired t-test was used to identify significant differences in colony size (with
FDR-corrected p-values). These pairs of interacting proteins were separated into two
additional categories: unchanged interactions, in cases where the interaction was detected
with the reference linker size (2xL-2xL) and also with the longer linker combinations but
without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in
cases where the interaction was detected with the reference linker size (2xL-2xL) and
presented significant changes for at least one longer linker combination (difference greater
than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
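The classification rules above can be summarised in a short sketch. The text does not say which FDR procedure was used; the Benjamini-Hochberg adjustment below is an assumption. The `classify` helper is a simplified, hypothetical reading of the three categories, and it takes p-values as given rather than computing the paired t-test.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_long, diffs, adj_pvalues):
    """Classify one protein pair.

    detected_ref:  interaction detected with 2xL-2xL (Zs > 2.5)
    detected_long: interaction detected with at least one longer combination
    diffs:         Is difference from 2xL-2xL, one per longer combination
    adj_pvalues:   FDR-adjusted paired t-test p-values, same order as diffs
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    for diff, p in zip(diffs, adj_pvalues):
        if abs(diff) > 1 and p < 0.05:
            return "quantitative change"
    return "unchanged"


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
label = classify(True, True, [1.5, 0.2, 0.4], [0.01, 0.5, 0.3])
```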
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from
the experimental set with the highest sequence identities. Similarly, structure 4C2M was
selected as the representative RNApol I dimeric complex. Table S2B presents the identity
between each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex
in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a
full core. A complete proteasome structure was built by superposing two 5A5B structures on
the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL.
Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap
in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final
proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the
core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the
methodology used to build the final proteasome structure. Table S2C presents the identity
between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate
the distance (a list is provided in Table S2D). The distances were calculated from the
weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example
of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of
surface residues were used as nodes to build the graph. The edges of the graph were placed
between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for
the proteasome. The weight of each edge was equal to the distance between the node pair.
Surface residues were identified as follows. First, the structure of the protein complex was
represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a
solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome,
respectively. These dots were exported in the “wrl” graphic file format. From this file, the
coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II
structure and within 20 Å of the proteasome structure were considered surface residues (see
Fig. S2D for a representation of the method for the proteasome). In cases where multiple
copies of the proteins were present within the complexes, the mean of the minimal possible
distances was used for the analyses.
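The surface-path distance described above can be sketched as follows. The study used the Dijkstra implementation in NetworkX; the version below swaps in a small standard-library implementation (`heapq`) so that it runs without extra dependencies. All coordinates, residue labels and function names are hypothetical, and parsing of PDB and “wrl” files is left out.

```python
import heapq
import math


def surface_residues(ca_coords, dots, radius):
    """Keep residues whose C-alpha lies within `radius` of any solvent dot."""
    return {res: xyz for res, xyz in ca_coords.items()
            if any(math.dist(xyz, d) <= radius for d in dots)}


def build_graph(nodes, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = distance."""
    graph = {name: [] for name in nodes}
    names = list(nodes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(nodes[a], nodes[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if the target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in graph[node]:
            cand = dist + weight
            if cand < best.get(nxt, math.inf):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt))
    return math.inf


# Toy complex: three surface residues and one buried residue (coordinates in Å).
ca_coords = {"A": (0, 0, 0), "B": (12, 0, 0), "C": (12, 12, 0), "X": (6, 6, -40)}
dots = [(0, 0, 5), (12, 0, 5), (12, 12, 5)]

surface = surface_residues(ca_coords, dots, radius=15)
graph = build_graph(surface, cutoff=15)
path_length = shortest_path_length(graph, "A", "C")
```

In this toy example A and C are about 17 Å apart in a straight line, beyond the 15 Å cutoff, so the path has to go through B. This is the intended behaviour: the distance follows the surface of the complex rather than cutting through it.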
All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length
affects protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of
PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA
signal represent cases between baits and preys within the same complexes, suggesting that
there is no decrease in specificity with the elongated linkers. Finally, for the cases where
proteins were not in the same complex or were not previously shown to interact, it is likely
that they represent actual interactions previously undetected in living cells. For example,
many genetic interactions and physical interactions (in vitro and in vivo) have been
described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some
interactions in living cells (such as between Arc18 and Pup1), often with an increased
signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the
DHFR PCA with increased linker size reveals new interactions and could be an improved
tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the
10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and
Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with
reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR
F[3] counted as a single PPI).
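The distinction between directed bait-prey interactions and unique pairs can be illustrated with a few lines of code. The protein names X, Y and Z are placeholders, as in the text, and the function name is hypothetical.

```python
def unique_pairs(interactions):
    """Collapse reciprocal (bait, prey) interactions into unordered pairs."""
    return {frozenset((bait, prey)) for bait, prey in interactions}


# (bait fused to DHFR F[1,2], prey fused to DHFR F[3])
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
pairs = unique_pairs(detected)
print(len(detected), "detected interactions,", len(pairs), "unique pairs")
```

Using `frozenset` makes X-Y and Y-X hash to the same key, so a pair detected in both orientations is counted once.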
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction
strength was observed when using the 4xL compared to the 2xL, reinforcing the observation
that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two
largest components of the RNApol II, contributes to five out of the nine quantitatively
decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and
Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric
effects rather than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the
RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the
RNApol complexes, more than half of the new interactions were found between proteins
common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions
involved Nas6, an assembly chaperone for the proteasome, and proteins from the base
subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between
Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All
these results show that doubling the linker length of central proteins in complexes expands
the network of interactions detected by DHFR PCA and helps to better describe the
organization of protein complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left
and right panels). Structural biology in living cells could thus gain from PPI data obtained
with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for
distance, we measured the shortest path between the C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig. 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the cumulative
distribution of z-scores as a function of pairwise distance, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig. 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the
distance between proteins is significantly longer for cases where a longer linker increases
the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once
again that a longer linker enhances the ability to detect interactions, especially for
proteins that are more distant in space.
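The relationship between pairwise distance and z-score can be quantified with a rank correlation, as sketched below. The text does not state which correlation coefficient was used, so the Spearman coefficient here is an assumption, and the distance and z-score values are invented purely to show the calculation.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up example: z-scores decreasing with surface-path distance (Å).
distances = [20, 40, 60, 80, 100]
z_scores = [9.0, 7.5, 4.0, 3.0, 1.0]
rho = spearman(distances, z_scores)
```

A rank-based coefficient is a reasonable default here because the relationship between distance and signal need not be linear, only monotonic.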
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here,
we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used
to detect interactions in these specific conditions with an increased signal-to-noise ratio
and with an enhanced ability to detect distant PPIs, including interactions among complexes
and subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection
could be used as preys, requiring only the construction of baits with different linker sizes.
PCA is therefore an addition to the other methods available to obtain low-resolution
structural information among subunits of complexes, which include chemical cross-linking
of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent
biotinylation in mammalian cells (68). Despite major advances in these other technologies
in recent years, PCA will remain the simplest assay because it requires minimal
infrastructure investment and can be adapted for high-throughput screening, which is still
difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted
with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
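Panel (C) counts reciprocal bait-prey orientations as a single PPI before computing proportions. The sketch below shows one simple way to do this by treating each pair as an unordered set. The function names, the protein pairs and the category labels are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def unique_ppis(detected):
    """Collapse reciprocal pairs into unordered PPIs.

    `detected` holds (F[1,2]-tagged protein, F[3]-tagged protein) tuples,
    so ("X", "Y") and ("Y", "X") map to the same frozenset.
    """
    return {frozenset(pair) for pair in detected}


def category_proportions(categories):
    """Proportion of each label among unique PPIs.

    `categories` maps an unordered PPI to a label such as "new",
    "changed" or "unchanged" (hypothetical labels).
    """
    counts = Counter(categories.values())
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical detections: the first two entries are reciprocal.
detected = [("Rpn5", "Rpn6"), ("Rpn6", "Rpn5"), ("Pre1", "Pre2")]
ppis = unique_ppis(detected)
labels = {
    frozenset(("Rpn5", "Rpn6")): "new",
    frozenset(("Pre1", "Pre2")): "unchanged",
}
print(len(ppis), category_proportions(labels))
```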
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
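The cumulative distribution in panel (B, right) can be thought of as a running sum of z-scores over pairs ordered by C-termini distance. The following is a minimal sketch under that assumption; the exact procedure used to draw the figure is not described here, and the function name and example values are hypothetical.

```python
from itertools import accumulate


def cumulative_z_by_distance(pairs):
    """Order (distance, z_score) pairs by distance and accumulate the z-scores.

    Returns the sorted distances and the running sum of z-scores, so that the
    distance at which the curve stops rising can be compared between linkers.
    """
    ordered = sorted(pairs)
    distances = [d for d, _ in ordered]
    running = list(accumulate(z for _, z in ordered))
    return distances, running


# Hypothetical (distance in Å, z-score) values for one linker combination.
example = [(50.0, 2.0), (10.0, 5.0), (30.0, 3.0)]
dists, cumulative = cumulative_z_by_distance(example)
print(dists, cumulative)
```

If the curve for one linker combination keeps rising at distances where another has already flattened, positive signal is still being detected for more distant pairs.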
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in standard PCA (left).
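Panel (B) relates colony size to a z-score threshold used to call positive PPIs. As a rough illustration only, the sketch below standardizes colony sizes against a background distribution and keeps pairs above a user-supplied threshold. The function names, the background values and the threshold passed in the example are assumptions for illustration and do not reproduce the study's filtering procedure.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Standardize each colony size against the background distribution."""
    mu, sigma = mean(background), stdev(background)
    return {ppi: (size - mu) / sigma for ppi, size in colony_sizes.items()}


def positive_ppis(colony_sizes, background, threshold):
    """Return the PPIs whose z-score exceeds `threshold`, sorted by name."""
    scores = z_scores(colony_sizes, background)
    return sorted(ppi for ppi, z in scores.items() if z > threshold)


# Hypothetical colony sizes (arbitrary units) and background measurements.
background = [10, 12, 8, 10]
sizes = {"A-B": 20, "C-D": 11}
print(positive_ppis(sizes, background, threshold=2.5))
```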
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used
for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot
surface. Green spheres: surface residues on the proteasome.
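A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm over a set of surface points. The version below assumes that two points are connected when they lie within a cutoff distance and that each edge is weighted by its Euclidean length. The cutoff, the point names and the coordinates are hypothetical, and the actual graph construction used for the figure may differ.

```python
import heapq
from math import dist, inf


def surface_path_length(points, start, goal, cutoff):
    """Length of the shortest path from `start` to `goal` over `points`.

    `points` maps a label to (x, y, z) coordinates. Edges join points whose
    Euclidean distance is at most `cutoff` and are weighted by that distance.
    Returns `inf` when the two points are not connected.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            return d
        if d > best.get(node, inf):
            continue  # stale queue entry
        for other, coords in points.items():
            if other == node:
                continue
            step = dist(points[node], coords)
            if step <= cutoff and d + step < best.get(other, inf):
                best[other] = d + step
                heapq.heappush(queue, (d + step, other))
    return inf


# Hypothetical coordinates (Å): two C-termini and two intermediate surface points.
points = {
    "A_Cter": (0.0, 0.0, 0.0),
    "s1": (3.0, 4.0, 0.0),
    "s2": (0.0, 8.0, 0.0),
    "B_Cter": (6.0, 8.0, 0.0),
}
print(surface_path_length(points, "A_Cter", "B_Cter", cutoff=6.0))
```

Routing the path through surface points rather than taking the straight-line distance avoids paths that would cross the interior of the complex.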
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close in space without these associations necessarily being physical interactions.
Such a method would make it possible to examine and dissect the architecture of
protein complexes in greater depth. Concretely, the aim was to modify the length of the
linkers used in DHFR PCA in S. cerevisiae. To validate the method, we first had to
verify whether increasing linker length changed the interactions detected. It was also
relevant to test the application of the method to the study of protein complexes using
several combinations of linkers of different lengths. Finally, validation of the method
could be completed by comparing the results with distances measured from the available
protein structures of the proteasome.
The results of the first validation show that by changing a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed
within protein complexes and between different complexes, notably between the proteasome
and the actin cytoskeleton. The nature of the associations detected suggests that the
specificity of DHFR PCA is maintained despite the change in linker length. The in-depth
study of the five protein complexes shows that this variation of DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain
almost as many new interactions if this variation of PCA were applied to protein complexes
that have already been studied. This percentage could vary with the number of combinations
of linkers of different lengths that are used. For example, this number could be lower if
only a single combination were used, since some protein-protein associations were
detectable only with a specific combination of linkers. Using an extended linker for the
DHFR F[1,2] fragment appears to be sufficient to detect the majority of new PPIs and of
those whose signal increases.
The rare cases in which the signal decreased with increasing linker length are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases
can nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves central proteins.
Finally, the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures
with the PCA results obtained, it is possible to confirm that increasing linker length
does allow the detection of associations between proteins that are more distant in space.
The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of other lengths, which would provide a validation and a point of
comparison and would help detect problems that may have occurred during the construction
of the fusion proteins. For example, it is easier to spot a problem of incorrect
recombination or the appearance of mutations. Indeed, interactions would be observed for
the correctly constructed protein, whereas the problematic one would show none. However,
adding this control certainly makes the experiments and analyses more complex. Despite
this drawback, this variation of DHFR PCA provides an additional hybrid method that
remains relatively simple. It requires no particular infrastructure, yet it can also be
applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method
that preserves the endogenous promoter for protein expression. The fragments do not tend
to interact spontaneously with each other unless they are very close together, which
reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is
therefore easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be followed in real time, which gives access to the study of
interaction dynamics (56). These features provide certain advantages compared with other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths providing several resolutions of the PPI network. It would first
be necessary to determine the maximal length that allows the detection of plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize new information while taking into account the
additional complexity introduced by each added linker. The availability of robotic
platforms makes the creation of collections of DHFR F[1,2] fusion proteins with different
linker lengths more realistic. Such additional collections would provide a picture of the
yeast protein-protein association network at different resolutions, from fine to coarse.
Indeed, the longer the linker, the more distant the associations detected, which lowers
the molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into consideration. For
small protein complexes, a finer resolution, and therefore shorter linkers, could prove
sufficient, whereas a lower resolution would be needed for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but that are visible by electron microscopy or with other imaging methods. The
size of these complexes greatly limits their study and represents a challenge for
determining their architecture. Processing bodies and stress granules are one example.
They are involved, respectively, in the degradation and the preservation of messenger RNA
during cellular stress, and they are notably linked to various diseases such as cancer and
acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker
extension would give us a general picture of their architecture. At the level of an
organism's proteome, this method would provide a better definition of the organization of
the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
For the intra-complex experiment, extreme outliers on the MTX selection plates, i.e. colonies
farther from the median than Q1 - 3(Q3 - Q1) or Q3 + 3(Q3 - Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as were strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained, and the median colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
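The filtering and scoring steps described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the function names, the quartile interpolation method and the example values are assumptions, and the z-score is assumed to take the form Zs = (Is - b) / sdb.

```python
import statistics

def remove_extreme_outliers(sizes):
    """Drop colony sizes below Q1 - 3*IQR or above Q3 + 3*IQR."""
    # Assumption: "inclusive" quartile interpolation; the source does not say which was used.
    q1, _, q3 = statistics.quantiles(sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 3 * iqr, q3 + 3 * iqr
    return [s for s in sizes if low <= s <= high]

def interaction_score(replicates, min_replicates=4):
    """Median colony size (Is), or None when fewer than four replicates remain."""
    kept = remove_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)

def z_score(i_s, b, sd_b):
    """Assumed form of Zs: deviation of Is from the background mean b, in units of sdb."""
    return (i_s - b) / sd_b

def is_detected(i_s, b, sd_b, threshold=2.5):
    """An interaction is called detected when Zs exceeds the threshold (2.5 in the text)."""
    return z_score(i_s, b, sd_b) > threshold

# Hypothetical colony sizes with one extreme outlier.
sizes = [10, 12, 11, 13, 12, 11, 200]
print(remove_extreme_outliers(sizes))
print(is_detected(interaction_score(sizes), b=4.0, sd_b=2.5))
```

In practice, b and sdb would be estimated separately for each linker combination and each complex, as described in the text, except for the COG complex where all results were pooled.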
At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that a block of four
adjacent strains covered all combinations of linker lengths for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations
could be compared directly. The difference from the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented a significant change for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
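The classification rules above can be written compactly. The sketch below is illustrative only: the text says p-values were FDR corrected without naming the procedure, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names and the choice to label a pair with a significant p-value but an absolute difference of 1 or less as unchanged. The t-test itself is not reimplemented; p-values are taken as input.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def classify_pair(detected_reference, detected_longer, differences, fdr_p_values,
                  alpha=0.05, min_difference=1.0):
    """Assign a protein pair to one of the categories described in the text.

    differences and fdr_p_values hold one entry per longer linker combination
    (2xL-4xL, 4xL-2xL, 4xL-4xL), each relative to the 2xL-2xL reference.
    """
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_difference and p < alpha
                  for d, p in zip(differences, fdr_p_values))
    return "quantitative change" if changed else "unchanged"

# Hypothetical values for one pair detected with the reference linker.
p_adj = benjamini_hochberg([0.01, 0.04, 0.03])
print(classify_pair(True, True, [1.5, 0.2, 0.1], p_adj))
```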
Analysis of protein distances within complexes
Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core; the inner rings of the core were taken from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated as the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as
nodes to build the graph. Edges were placed between each pair of nodes
within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
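The surface-path procedure above can be illustrated with a small self-contained sketch. The study used the Dijkstra implementation in NetworkX; the version below substitutes a plain heap-based Dijkstra from the Python standard library so that it runs without dependencies. The function names, the toy coordinates and the cutoff values in the example are hypothetical.

```python
import heapq
import math

def surface_nodes(ca_coords, dots, cutoff):
    """Keep residues whose C-alpha lies within `cutoff` of any solvent dot."""
    return {res: xyz for res, xyz in ca_coords.items()
            if any(math.dist(xyz, d) <= cutoff for d in dots)}

def build_graph(nodes, edge_cutoff):
    """Connect node pairs within `edge_cutoff`; each edge is weighted by its length."""
    graph = {res: {} for res in nodes}
    residues = list(nodes)
    for i, a in enumerate(residues):
        for b in residues[i + 1:]:
            d = math.dist(nodes[a], nodes[b])
            if d <= edge_cutoff:
                graph[a][b] = d
                graph[b][a] = d
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf

# Toy example: three surface residues in a row and one buried residue (D).
ca = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (10, 50, 0)}
dots = [(0, 5, 0), (10, 5, 0), (20, 5, 0)]
nodes = surface_nodes(ca, dots, cutoff=6)
graph = build_graph(nodes, edge_cutoff=15)
print(shortest_path_length(graph, "A", "C"))
```

Because A and C are farther apart than the edge cutoff, the path has to pass through B, which is how the graph forces distances to follow the surface of the complex rather than cut through it.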
All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that are within
the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted
as a single PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
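The counts reported in this paragraph are internally consistent, which a short tally makes explicit. The category labels in this sketch are ours; the numbers are those given in the text.

```python
# (all pairs, unique pairs) per category, as reported in the text.
counts = {
    "unchanged": (135, 114),
    "quantitative change": (19, 18),
    "new": (74, 65),
}

total_all = sum(all_pairs for all_pairs, _ in counts.values())
total_unique = sum(unique for _, unique in counts.values())

# Pairs affected by the longer linker: quantitative changes plus new PPIs.
affected_all = counts["quantitative change"][0] + counts["new"][0]
affected_unique = counts["quantitative change"][1] + counts["new"][1]

print(total_all, total_unique)        # pairs after filtration
print(affected_all, affected_unique)  # pairs with an obvious linker effect
print(f"{counts['unchanged'][0] / total_all:.1%} unchanged (all pairs)")
print(f"{counts['unchanged'][1] / total_unique:.1%} unchanged (unique pairs)")
```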
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
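The relationship between pairwise distance and z-score discussed above is a rank correlation (Spearman, per the Figure 2 legend). As a minimal sketch of how such a coefficient is obtained, the code below computes Spearman's r as the Pearson correlation of average ranks. The function names and the example values are hypothetical, and no p-value is computed.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical C-terminus distances and interaction z-scores.
distances = [10, 20, 30, 40, 50]
z_scores = [9, 7, 8, 3, 1]
print(round(spearman(distances, z_scores), 2))
```

A negative coefficient, as in this toy example, corresponds to z-scores that tend to fall as the distance between C-termini grows.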
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini: interactions that are not affected by longer linkers,
those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2
Table S2C. Identity between proteasome structure and the experimental sequence
Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast protein  Complex     Reference       No. of missing residues in C-ter
Rpa190         RNApol I    4C2M monomer 1  0
Rpa14          RNApol I    4C2M monomer 1  37
Rpa12          RNApol I    4C2M monomer 1  0
Rpb5           RNApol I    4C2M monomer 1  0
Rpb10          RNApol I    4C2M monomer 1  1
Rpa49          RNApol I    4C2M monomer 1  300
Rpc19          RNApol I    4C2M monomer 1  0
Rpb8           RNApol I    4C2M monomer 1  0
Rpa34          RNApol I    4C2M monomer 1  52
Rpa43          RNApol I    4C2M monomer 1  10
Rpc40          RNApol I    4C2M monomer 1  0
Rpc10          RNApol I    4C2M monomer 1  0
Rpa135         RNApol I    4C2M monomer 1  0
Rpo26          RNApol I    4C2M monomer 1  1
Rpa190         RNApol I    4C2M monomer 2  0
Rpa14          RNApol I    4C2M monomer 2  37
Rpa12          RNApol I    4C2M monomer 2  0
Rpb5           RNApol I    4C2M monomer 2  0
Rpb10          RNApol I    4C2M monomer 2  1
Rpa49          RNApol I    4C2M monomer 2  300
Rpc19          RNApol I    4C2M monomer 2  0
Rpb8           RNApol I    4C2M monomer 2  0
Rpa34          RNApol I    4C2M monomer 2  53
Rpa43          RNApol I    4C2M monomer 2  76
Rpc40          RNApol I    4C2M monomer 2  0
Rpc10          RNApol I    4C2M monomer 2  0
Rpa135         RNApol I    4C2M monomer 2  0
Rpo26          RNApol I    4C2M monomer 2  1
Rpa190         RNApol I    4C3I            1
Rpa14          RNApol I    4C3I            37
Rpb5           RNApol I    4C3I            0
Rpb10          RNApol I    4C3I            1
Rpa49          RNApol I    4C3I            301
Rpc19          RNApol I    4C3I            0
Rpb8           RNApol I    4C3I            0
Rpa34          RNApol I    4C3I            53
Rpa12          RNApol I    4C3I            0
Rpa43          RNApol I    4C3I            10
Rpc40          RNApol I    4C3I            0
Rpc10          RNApol I    4C3I            0
Rpa135         RNApol I    4C3I            0
Rpo26          RNApol I    4C3I            1
Rpb3           RNApol II   4V1N            50
Rpb11          RNApol II   4V1N            6
Rpb5           RNApol II   4V1N            0
Rpb7           RNApol II   4V1N            0
Rpb10          RNApol II   4V1N            5
Rpo26          RNApol II   4V1N            0
Rpb8           RNApol II   4V1N            0
Rpb4           RNApol II   4V1N            0
Rpb9           RNApol II   4V1N            2
Tfg2           RNApol II   4V1N            173
Rpb2           RNApol II   4V1N            0
Rpc10          RNApol II   4V1N            0
Rpo21          RNApol II   4V1N            278
Rpc11          RNApol III  5FJA            0
Rpc19          RNApol III  5FJA            0
Ret1           RNApol III  5FJA            0
Rpb5           RNApol III  5FJA            0
Rpb10          RNApol III  5FJA            3
Rpc37          RNApol III  5FJA            20
Rpc82          RNApol III  5FJA            0
Rpc31          RNApol III  5FJA            182
Rpb8           RNApol III  5FJA            0
Rpc53          RNApol III  5FJA            0
Rpc25          RNApol III  5FJA            0
Rpc34          RNApol III  5FJA            2
Rpo31          RNApol III  5FJA            0
Rpc40          RNApol III  5FJA            0
Rpc10          RNApol III  5FJA            0
Rpc17          RNApol III  5FJA            0
Rpo26          RNApol III  5FJA            2
Rpn6           Proteasome  5CZ4 and 5A5B   3
Rpn5           Proteasome  5CZ4 and 5A5B   3
Rpn3           Proteasome  5CZ4 and 5A5B   45
Rpn2           Proteasome  5CZ4 and 5A5B   20
Rpn1           Proteasome  5CZ4 and 5A5B   0
Rpn9           Proteasome  5CZ4 and 5A5B   6
Rpn8           Proteasome  5CZ4 and 5A5B   30
Pre10          Proteasome  5CZ4 and 5A5B   39
Pre6           Proteasome  5CZ4 and 5A5B   10
Pre7           Proteasome  5CZ4 and 5A5B   0
Rpt3           Proteasome  5CZ4 and 5A5B   0
Rpt2           Proteasome  5CZ4 and 5A5B   1
Pre2           Proteasome  5CZ4 and 5A5B   0
Rpt4           Proteasome  5CZ4 and 5A5B   10
Pre1           Proteasome  5CZ4 and 5A5B   3
Pre8           Proteasome  5CZ4 and 5A5B   0
Pre9           Proteasome  5CZ4 and 5A5B   12
Pup2           Proteasome  5CZ4 and 5A5B   9
Pup3           Proteasome  5CZ4 and 5A5B   0
Pup1           Proteasome  5CZ4 and 5A5B   6
Rpn13          Proteasome  5CZ4 and 5A5B   23
Rpn12          Proteasome  5CZ4 and 5A5B   2
Rpn11          Proteasome  5CZ4 and 5A5B   8
Rpn10          Proteasome  5CZ4 and 5A5B   71
Sem1           Proteasome  5CZ4 and 5A5B   0
Scl1           Proteasome  5CZ4 and 5A5B   0
Rpt1           Proteasome  5CZ4 and 5A5B   11
Pre4           Proteasome  5CZ4 and 5A5B   4
Pre5           Proteasome  5CZ4 and 5A5B   0
Rpt5           Proteasome  5CZ4 and 5A5B   0
Pre3           Proteasome  5CZ4 and 5A5B   0
Rpt6           Proteasome  5CZ4 and 5A5B   9
Rpn7           Proteasome  5CZ4 and 5A5B   7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes experiments (proteasome:
top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to
detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for
selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation
by DHFR PCA in spot-dilution assay of selected results of the intra-complexes experiment.
Examples for each category of changes are shown. Cell growth in spot-dilution assay (right)
correlates with colony size in standard PCA (left).
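The thresholding described in panel (B) can be sketched as follows. The 2.5 cutoff comes from the caption; everything else is an assumption made for illustration: the z-score is taken relative to the mean and sample standard deviation of all colony sizes on a plate (the background model actually used is not described in this caption), and the pair labels and colony sizes are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff stated in the Figure S1 caption


def z_scores(colony_sizes):
    """Standardize colony sizes against their own mean and sample SD (assumed model)."""
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(pairs, colony_sizes, threshold=Z_THRESHOLD):
    """Return (pair, z) for every pair whose z-score exceeds the threshold."""
    zs = z_scores(colony_sizes)
    return [(pair, z) for pair, z in zip(pairs, zs) if z > threshold]


# Hypothetical plate: 20 background colonies and one large colony.
sizes = [100, 102, 98, 101, 99] * 4 + [300]
labels = [f"pair_{i}" for i in range(len(sizes))]
hits = positive_ppis(labels, sizes)
```

With this toy plate only the last pair is called positive, because it is the only colony whose size lies more than 2.5 standard deviations above the plate mean.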
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
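A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm. This is an illustration under stated assumptions, not the procedure used for the figure: surface residues are reduced to 3D points, two points are connected when they lie within a hypothetical `cutoff` distance, and each edge is weighted by its Euclidean length. The function name, the cutoff and the coordinates are all hypothetical.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Dijkstra over surface points; edges join points within `cutoff`,
    weighted by Euclidean distance. Returns (path length, list of indices)."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            break
        for v in range(len(points)):
            if v == u or v in visited:
                continue
            w = math.dist(points[u], points[v])
            if w > cutoff:
                continue  # too far apart to be neighbours on the surface
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if end not in dist:
        return math.inf, []  # the two termini are not connected
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[end], path[::-1]


# Hypothetical coordinates: two C-termini (indices 0 and 1) and one
# intermediate surface residue (index 2) that the path must pass through.
surface = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
length, path = shortest_surface_path(surface, start=0, end=1, cutoff=8.0)
```

Because the direct segment between the two termini is longer than the cutoff, the path is forced along the intermediate surface point, which is the reason a surface path can be longer than the straight-line distance through the complex.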
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close to each other in space without necessarily being physical interactions. Such a method
would make it possible to probe and dissect the architecture of protein complexes in greater
depth. Concretely, this involved modifying the length of the linkers of the DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results obtained with the distances measured from the available protein
structures of the proteasome.
The results of the first validation show that by varying a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed
within protein complexes and between different complexes, notably between the proteasome
and the actin cytoskeleton. The nature of the detected associations suggests that the
specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth
study of the five protein complexes shows that this variation of the DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain nearly
as many new interactions if this variation of the PCA were applied to protein complexes that
have already been studied. This percentage could vary depending on the number of
combinations of linkers of different lengths used. For example, it could be lower if only a
single combination were used, since some protein-protein associations were detectable only
with one specific combination of linkers. Using an extended linker for the DHFR F[1,2]
fragment appears to be sufficient to detect most of the new PPIs and those whose signal
increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. However, these
cases can still provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers provides information
on the organization of protein complexes, particularly when it involves central proteins.
Finally, the detected associations accurately reflect the organization of protein complexes
into subcomplexes. By comparing the distances between proteins in the proteasome structures
with the PCA results obtained, it is possible to confirm that increasing the linker length
does allow the detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of other lengths, which would provide a validation and a point of
comparison and would help detect problems that may have arisen during the construction of
the fusion proteins. For example, it becomes easier to spot a faulty recombination event or
the appearance of mutations: interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. However, adding this control certainly
makes the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA provides an additional hybrid method that remains relatively simple. It does not
require any particular infrastructure, but it can also be applied at large scale using a
robotic platform. In addition, the DHFR PCA is an in vivo method that retains the endogenous
promoter for protein expression. The fragments do not tend to interact spontaneously with
each other unless they are very close together, which reduces false positives. The DHFR PCA
can be performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
monitored in real time, which gives access to the study of interaction dynamics (56). These
features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths providing several resolutions of the PPI network. It would first
be necessary to determine the maximum length that allows the detection of plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize new information while taking into account the
additional complexity introduced with each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the detected associations, which decreases the
molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into account. For small
protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient,
whereas the resolution would have to be lower for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but that are visible by electron microscopy or with other imaging methods. The
size of these complexes greatly limits their study and represents a challenge for
determining their architecture. Processing bodies and stress granules are examples. They
are involved, respectively, in the degradation and the storage of messenger RNA during
cellular stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the
linker would give us a general picture of their architecture. At the scale of an organism's
proteome, this method would provide a better definition of the organization of the cellular
machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
2xL) and also with the longer linker combinations, but without any significant change (t-test,
FDR p-value above 0.05), and quantitative changes in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
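The classification rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR correction (the text does not name the method), and the "difference" is assumed to be the change in PCA score relative to the 2xL-2xL reference. PPIs with a significant but small change (absolute difference of 1 or less) are treated here as unchanged, which is one possible reading of the text.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the text only says "FDR p-value"; BH is used here as a
    common choice, not as the confirmed method.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_ppi(detected_ref, detected_long, diffs, fdr_pvalues,
                 alpha=0.05, min_abs_diff=1.0):
    """Classify one PPI from its reference and longer-linker measurements.

    detected_ref  : True if detected with the 2xL-2xL reference.
    detected_long : True if detected with at least one longer combination.
    diffs         : differences (longer minus reference), one per combination.
    fdr_pvalues   : FDR-adjusted t-test p-values matching `diffs`.
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    changed = any(abs(d) > min_abs_diff and p < alpha
                  for d, p in zip(diffs, fdr_pvalues))
    return "quantitatively changed" if changed else "unchanged"
```

The thresholds (`alpha=0.05`, `min_abs_diff=1.0`) are taken from the text; everything else is an assumption made for illustration.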
Analysis of protein distances within complexes
Yeast protein sequences of RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.
The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base
and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the
shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface
residues were used as nodes to build the graph. The edges of the graph were placed between
each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the
proteasome. The weight of each edge was equal to the distance between the two nodes it
connects. Surface residues were identified as follows. First, the structure of the protein
complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL,
using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome.
These dots were exported in the “wrl” graphic file format. From this file, the coordinates of
each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and
within 20 Å of any dot of the proteasome structure were considered as surface residues (see
Fig S2D for a representation of the method for the proteasome). In cases where multiple copies
of the proteins were present within the complexes, the mean of the minimal distances possible
was used for the analyses.
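The distance calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the analysis used the Dijkstra implementation in NetworkX, whereas this sketch uses a small standard-library version of the same algorithm so that it runs without dependencies. The function names and the toy coordinates in the usage note are hypothetical, and the handling of multiple protein copies (for each copy of one protein, take the distance to the nearest copy of the other, then average) is one possible reading of "the mean of the minimal distances possible".

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Build a weighted graph from surface Cα coordinates.

    coords : dict mapping a node label to its (x, y, z) position in Å.
    cutoff : nodes closer than this distance (Å) are joined by an edge
             whose weight is the Euclidean distance between them.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            nd = d + weight
            if nd < best.get(neighbor, math.inf):
                best[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return math.inf


def mean_minimal_distance(graph, copies_a, copies_b):
    """Average, over copies of protein A, of the distance to the nearest
    copy of protein B (an assumed interpretation of the text)."""
    minima = [min(shortest_path_length(graph, a, b) for b in copies_b)
              for a in copies_a]
    return sum(minima) / len(minima)
```

With four toy nodes at `(0,0,0)`, `(10,0,0)`, `(20,0,0)` and `(10,10,0)` and a 15 Å cutoff, the first and third nodes are not directly connected, so the path must pass through an intermediate surface node. This is the point of the approach: the path follows the surface of the complex rather than cutting through it.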
All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using
antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects
protein stability, it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins that are known to interact with the baits, that
are within the same complexes as the baits, or that are random proteins used as controls, for a
total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and
126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many genetic
interactions and physical interactions (in vitro and in vivo) have been described between the
actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living
cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared
to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker
size reveals new interactions and could be an improved tool to study inter-complex
associations.
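The size of the screen and the hit-calling threshold can be made concrete with a short sketch. It assumes that z-scores have already been computed for each bait-prey pair (the scoring itself is not described in this section), and the function names and example z-scores are hypothetical placeholders, not values from Table S1B.

```python
def screen_size(n_proteins, n_linkers, n_preys):
    """Number of bait-prey combinations tested in one replicate."""
    return n_proteins * n_linkers * n_preys


def call_hits(z_scores, threshold=2.5):
    """Return the (bait, prey) pairs whose z-score exceeds the threshold.

    z_scores : dict mapping (bait, prey) to a precomputed z-score.
    """
    return {pair for pair, z in z_scores.items() if z > threshold}


# 15 proteins x 3 linker lengths = 45 baits, each tested against 592 preys.
total_tests = screen_size(15, 3, 592)

# Placeholder z-scores for illustration only.
example_scores = {
    ("baitA_4xL", "prey1"): 3.1,
    ("baitA_4xL", "prey2"): 0.4,
    ("baitB_4xL", "prey1"): 2.5,  # equal to the threshold, so not a hit
}
hits = call_hits(example_scores)
```

Here `total_tests` matches the 26,640 potential interactions quoted above, and only pairs strictly above the 2.5 threshold are retained.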
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the
10,192 unique tested PPIs, 755 interactions were considered as true PPIs after filtration
(Fig S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with
reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3]
counted as a single PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e., X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However,
the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
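The distinction between all interacting pairs and unique pairs comes from collapsing reciprocal bait-prey orientations. A minimal sketch of that bookkeeping is given below; the function names are hypothetical and the protein labels X, Y and Z are placeholders rather than data from the study.

```python
def collapse_reciprocal(interactions):
    """Collapse oriented (bait, prey) pairs into unordered pairs.

    X as bait with Y as prey, and Y as bait with X as prey, are counted
    as a single unique PPI.
    """
    return {frozenset(pair) for pair in interactions}


def percent(part, total):
    """Share of `part` in `total`, in percent."""
    return 100.0 * part / total


# Placeholder oriented pairs: X-Y is present in both orientations.
oriented = [("X", "Y"), ("Y", "X"), ("X", "Z")]
unique = collapse_reciprocal(oriented)  # two unique pairs
```

Applied to the counts reported above, `percent(135, 228)` and `percent(114, 197)` both fall just under 60, consistent with the "almost 60%" of pairs showing no significant change.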
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from the destabilization of the protein (Fig 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe a or lobe b (Fig 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A).
We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are more readily detected with longer linker
sizes (Fig S1D). Finally, we find that the distance among proteins is significantly longer for
cases where a longer linker size increases signal or leads to the detection of new interactions
(Fig 2C). This demonstrates once again that a longer linker size enhances the ability to detect
interactions, especially for proteins that are more distant in space.
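The distance-versus-signal relationship above is summarized with a Spearman rank correlation (see the Figure 2B legend). As a minimal sketch of how such a coefficient is obtained, the following standard-library code ranks both variables (averaging ranks for ties) and computes the Pearson correlation of the ranks. The function names and the numbers in the usage note are illustrative assumptions, not the authors' code or data, and no p-value is computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, toy distances `[10, 20, 30, 40]` paired with toy z-scores `[5, 4, 2, 1]` are perfectly anti-monotonic, so `spearman` returns a coefficient of -1; the weaker negative coefficients reported for the proteasome indicate a noisier version of the same trend.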
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome.
Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini: interactions that are not affected
by longer linkers, those that increase in signal, and those that are newly detected. p-values of
Wilcoxon tests are shown.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect
interactions farther apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
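The distance-weighted shortest path shown in panels (C) and (D) can be illustrated with a
minimal sketch. It assumes that each edge is weighted by the Euclidean distance between the
two Cα positions it joins, which the caption suggests but does not state explicitly. The
coordinates and node names below are toy values, not taken from 5A5B or 5CZ4, and the helper
names build_graph and shortest_path_length are hypothetical. The analysis in this work relied
on the Dijkstra implementation in NetworkX; the standard-library version here is a stand-in
so that the example runs without extra dependencies.

```python
import heapq
import itertools
import math


def build_graph(coords, cutoff):
    """Link every pair of nodes closer than `cutoff` (in Å).

    Assumption: the edge weight is the Euclidean distance between the two nodes.
    """
    graph = {name: {} for name in coords}
    for a, b in itertools.combinations(coords, 2):
        d = math.dist(coords[a], coords[b])
        if d <= cutoff:
            graph[a][b] = d
            graph[b][a] = d
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Toy surface nodes (hypothetical coordinates, in Å).
coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (10.0, 0.0, 0.0),
    "C": (20.0, 0.0, 0.0),
    "D": (20.0, 10.0, 0.0),
}
graph = build_graph(coords, cutoff=15.0)
length = shortest_path_length(graph, "A", "D")
print(f"Path length from A to D: {length:.2f} Å")
```

Because A and D are farther apart than the cutoff, no direct edge joins them and the path has
to follow intermediate surface nodes. This is why a path-based distance can exceed the
straight-line distance between two C-termini.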
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close in space without
these necessarily being physical interactions. Such a method would allow the architecture
of protein complexes to be explored and dissected in greater depth. In practice, this meant
modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the
method, we first had to verify whether increasing linker length changed the set of
interactions detected. It was also relevant to test the method on protein complexes using
several combinations of linkers of different lengths. Finally, the validity of the method
could be further confirmed by comparing the results with distances measured from the
available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been
studied. This percentage could vary with the number of linker-length combinations used. For
example, it could be reduced by using only one combination, since some protein-protein
associations were detectable only with a specific combination of linkers. Using an extended
linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the
new PPIs and of those whose signal increases.
The rare cases in which the signal decreased with increasing linker length are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nonetheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers informs on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does allow the detection of associations
between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of additional lengths. This would provide a validation and a point
of comparison, and would help detect problems that arose during the construction of the
fusion proteins. For example, faulty recombination or the appearance of mutations would be
easier to spot: interactions would be observed for the correctly constructed protein,
whereas the problematic one would show none. This additional control does, however, make
the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA provides an additional hybrid method that remains relatively simple. It requires
no special infrastructure, yet can also be applied at large scale using a robotic platform.
Furthermore, the DHFR PCA is an in vivo method that retains the endogenous promoter for
protein expression. The fragments do not tend to associate spontaneously unless they are
very close to each other, which reduces false positives. The DHFR PCA can be performed on
either solid or liquid medium. It is therefore easy to study PPIs under several growth
conditions or in the presence of cellular perturbations. It can also be followed in real
time, which gives access to the dynamics of interactions (56). These features offer certain
advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that provides several resolutions of the PPI network. One would first
need to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. One would also need to determine the optimal
increment that maximizes new information, taking into account the additional complexity
introduced by each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths.
Such additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the detected associations, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be
appropriate for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. Their
size greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are examples. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they have been
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution afforded by lengthening the linker would give us a general picture
of their architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36
22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were
used as nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the two nodes. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format, and the coordinates of each dot were extracted from this file. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of a protein were present within a
complex, the mean of the minimal possible distances was used for the analyses.
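The distance calculation described above can be sketched as follows. This is a minimal illustration, not the authors' script: it uses a plain-Python Dijkstra implementation in place of NetworkX so that it runs without dependencies, and the function names, residue labels, coordinates, dot positions and radii are all hypothetical toy values.

```python
import heapq
import math
from itertools import combinations


def surface_residues(ca_coords, dots, radius):
    """Keep residues whose C-alpha atom lies within `radius` (Å) of any surface dot."""
    return {
        name: xyz
        for name, xyz in ca_coords.items()
        if any(math.dist(xyz, dot) <= radius for dot in dots)
    }


def build_surface_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (Å); weight = distance."""
    graph = {name: {} for name in coords}
    for a, b in combinations(coords, 2):
        d = math.dist(coords[a], coords[b])
        if d <= cutoff:
            graph[a][b] = d
            graph[b][a] = d
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the two nodes are not connected."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Toy input (hypothetical): three exposed residues on a line and one buried residue
ca_coords = {
    "A_Cterm": (0.0, 0.0, 0.0),
    "mid": (10.0, 0.0, 0.0),
    "B_Cterm": (20.0, 0.0, 0.0),
    "buried": (10.0, -50.0, 0.0),
}
dots = [(0.0, 1.0, 0.0), (10.0, 1.0, 0.0), (20.0, 1.0, 0.0)]

surface = surface_residues(ca_coords, dots, radius=2.0)
graph = build_surface_graph(surface, cutoff=15.0)
path_length = shortest_path_length(graph, "A_Cterm", "B_Cterm")
```

Because the two termini are farther apart than the cutoff, the path has to pass through the intermediate surface residue, which is how the method follows the surface of the complex instead of cutting through it. With NetworkX, the same graph could be built with `Graph.add_edge(a, b, weight=d)` and queried with `networkx.dijkstra_path_length`.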
All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using antibodies
targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein
stability, it has a minor effect that is not generalized.
To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These preys include proteins known to interact with the baits, proteins
that are within the same complexes as the baits, and random proteins used as controls, for a
total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and
126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA
with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that
are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some of these interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
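The hit-calling logic of this screen can be illustrated with a short sketch. It only assumes what the text states (a z-score threshold of 2.5 and a two-fold difference criterion). The function names are hypothetical, the z-scores are invented toy values rather than data from Table S1B, and skipping non-positive z-scores when computing a ratio is an assumption made here, not a documented step of the analysis.

```python
Z_THRESHOLD = 2.5  # detection threshold quoted in the text
FOLD = 2.0         # "greater than two-fold" z-score difference

# Size of the screen: 15 baits x 3 linker lengths, each tested against 592 preys
n_baits = 15 * 3
n_tests = n_baits * 592


def detected(z_scores, threshold=Z_THRESHOLD):
    """Return the set of bait-prey pairs whose z-score exceeds the threshold."""
    return {pair for pair, z in z_scores.items() if z > threshold}


def classify_changes(z_ref, z_test, threshold=Z_THRESHOLD, fold=FOLD):
    """Label pairs detected with at least one linker as 'increase' or 'decrease'."""
    changes = {}
    for pair in z_ref.keys() & z_test.keys():
        ref, test = z_ref[pair], z_test[pair]
        if max(ref, test) <= threshold:
            continue  # not detected with either linker
        if ref <= 0 or test <= 0:
            continue  # assumption: a ratio is not meaningful for non-positive z-scores
        if test / ref > fold:
            changes[pair] = "increase"
        elif ref / test > fold:
            changes[pair] = "decrease"
    return changes


# Hypothetical z-scores for three bait-prey pairs (not taken from Table S1B)
z_2xl = {("bait1", "prey1"): 1.2, ("bait2", "prey2"): 6.0, ("bait3", "prey3"): 3.0}
z_4xl = {("bait1", "prey1"): 3.1, ("bait2", "prey2"): 2.5, ("bait3", "prey3"): 3.3}

hits_2xl = detected(z_2xl)
hits_4xl = detected(z_4xl)
changes = classify_changes(z_2xl, z_4xl)
```

In this toy data set, the first pair is detected only with the longer linker, the second loses signal, and the third is detected with both linkers without a two-fold change, mirroring the three outcomes discussed in the text.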
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the 10,192
unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting
for only one PPI).
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
conclusion that no overall decrease in specificity is seen with the elongated linkers. However,
the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
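The bookkeeping behind these counts can be checked with a few lines of code. The sketch below shows one way to collapse reciprocal bait-prey orientations into unique pairs and recomputes the proportions from the numbers quoted in the text. The helper names and the toy proteins X, Y and Z are hypothetical.

```python
def unique_pairs(directed_pairs):
    """Collapse bait-prey and prey-bait orientations into one unordered pair."""
    return {frozenset(pair) for pair in directed_pairs}


def percent(part, total):
    """Percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# Toy example with hypothetical proteins X, Y and Z
directed = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_unique = len(unique_pairs(directed))

# Counts quoted in the text, as (all pairs, unique pairs)
unchanged = (135, 114)
changed = (19, 18)
new = (74, 65)
total = (228, 197)

# Pairs affected by the longer linker = quantitatively changed + newly detected
affected = tuple(c + n for c, n in zip(changed, new))
consistent = all(u + a == t for u, a, t in zip(unchanged, affected, total))
unchanged_pct = tuple(percent(u, t) for u, t in zip(unchanged, total))
```

Using `frozenset` makes the pair X-Y identical to Y-X, which is the convention used when the text reports "unique" interactions. The unchanged fraction comes out just under 60% for both the full and the unique pair counts.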
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from the destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a
longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.
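The relationship between distance and z-score is summarized with a Spearman correlation (Fig. 2B). As an illustration of what that statistic measures, the sketch below computes it in plain Python as the Pearson correlation of average ranks. The distances and z-scores are invented toy values, not data from Table S2A, and in practice a library routine such as `scipy.stats.spearmanr` would also return the p-value.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical pairwise distances (Å) and interaction z-scores
distances = [20.0, 45.0, 60.0, 90.0, 120.0]
z_scores = [8.1, 6.5, 7.0, 3.2, 1.0]
rho = spearman(distances, z_scores)
```

A negative coefficient, as obtained with these toy values, means that z-scores tend to decrease as the distance between C-termini increases, which is the trend reported for the proteasome.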
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions, with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome.
Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini: interactions that are not affected
by longer linkers, those that increase in signal and those that are newly detected. p-values of
Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assays of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: surface
dots. Green spheres: surface residues on the proteasome.
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close to each other in space without necessarily being in physical interaction. Such a
method would allow a deeper and finer dissection of the architecture of protein
complexes. Concretely, the approach was to modify the length of the linkers of the
DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify
whether increasing the linker length changed the interactions detected. It was also
relevant to test the method on protein complexes using several combinations of linkers
of different lengths. Finally, the validity of the method could be further confirmed by
comparing the results obtained with distances measured from the available protein
structures of the proteasome.
The results of the first validation show that by changing a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed
within protein complexes and between different complexes, notably between the
proteasome and the actin cytoskeleton. The nature of the detected associations suggests
that the specificity of the DHFR PCA is preserved despite the change in linker length.
The in-depth study of the five protein complexes shows that this variation of the DHFR
PCA detects new interactions while preserving the specificity of the method. Indeed,
among all the unique interactions detected, more than 30% were new. One could
therefore expect to obtain nearly as many new interactions if this variation of the PCA
were applied to protein complexes that have already been studied. This percentage could
vary with the number of combinations of linkers of different lengths used. For example,
it could be lower if only a single combination were used, since some protein-protein
associations were detectable only with one specific combination of linkers. Using an
elongated linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of
the new PPIs and those whose signal increases.
The rare cases where the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases
can nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of elongated linkers informs on
the organization of protein complexes, particularly when it involves central proteins.
Finally, the detected associations reflect well the organization of protein complexes
into subcomplexes. By comparing the distances between proteins in the proteasome
structures with the PCA results obtained, it is possible to confirm that increasing the
linker length does allow the detection of associations between proteins that are
further apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR
F[1,2] fragment, it is possible to increase the capacity to detect distant
protein-protein associations. In future experiments, it would be appropriate to use the
standard linker in addition to linkers of other lengths, which would provide a
validation and a point of comparison and would help detect problems that occurred
during the construction of the proteins. For example, it becomes easier to spot a faulty
recombination or the appearance of mutations: interactions would be observed for the
correctly constructed protein, whereas the problematic one would show none. Admittedly,
adding this control makes the experiments and analyses more complex. Despite this
drawback, this variation of the DHFR PCA provides an additional hybrid method that
remains relatively simple. It requires no particular infrastructure, yet it can also be
applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo
method that keeps the endogenous promoter for protein expression. The fragments do not
tend to interact spontaneously with each other unless they are very close, which
reduces false positives. The DHFR PCA can be performed on either solid or liquid
medium. It is therefore easy to study PPIs under several growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives
access to the dynamics of interactions (56). These features offer certain advantages
over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to
establish a range of linker lengths giving access to several resolutions of the PPI
network. One would first need to determine the maximum length that still detects
plausible protein-protein associations, thereby limiting false positives. One would
also need to determine the optimal increment that maximizes new information while
taking into account the added complexity of each additional linker. The availability of
robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins
with different linker lengths. Such additional collections would give a picture of the
yeast protein-protein association network at different resolutions, from fine to
coarse. Indeed, the longer the linker, the more distant the associations detected,
which lowers the molecular resolution. Before investigating a protein complex more
exhaustively, its characteristics, such as its size and flexibility, should be taken
into account. For small protein complexes, a finer resolution, and thus shorter
linkers, might be sufficient, whereas a coarser resolution would be needed for large
protein complexes.
The method developed during this master's project becomes particularly interesting for
the study of macromolecular protein complexes. These are complexes whose composition is
not fully known but which are visible by electron microscopy or other imaging methods.
The size of these complexes greatly limits their study and makes determining their
architecture a challenge. "Processing bodies" and stress granules are one example. They
are involved, respectively, in the degradation and the storage of messenger RNA during
cellular stress, and they are notably linked to various diseases such as cancer and
acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker
elongation would give us a general picture of their architecture. For the proteome of an
organism, this method would provide a better definition of the organization of the
cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.
To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by
PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs
that are not necessarily direct. Moreover, the four interactions with the highest PCA signal
represent cases between baits and preys within the same complexes, suggesting that there is
no decrease in specificity with the elongated linkers. Finally, for the cases where proteins
were not in the same complex or were not previously shown to interact, it is likely that they
represent actual interactions previously undetected in living cells. For example, many genetic
interactions and physical interactions (in vitro and in vivo) have been described between the
actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living
cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared
to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker
size reveals new interactions and could be an improved tool to study inter-complex associations.
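The size of this screen and the z-score cut-off can be illustrated with a short sketch. The counts (15 proteins, three linkers, 592 preys) and the cut-off come from the text above, with the cut-off read as 2.5. How colony sizes were normalized is not described here, so the standardization against the mean and standard deviation of the whole plate is an assumption, as are the function names and the toy colony sizes.

```python
from statistics import mean, stdev

N_PROTEINS = 15                    # bait proteins (from the text)
LINKERS = ("2xL", "3xL", "4xL")    # linker lengths fused to DHFR F[1,2]
N_PREYS = 592                      # prey strains fused to DHFR F[3]

n_baits = N_PROTEINS * len(LINKERS)   # 45 bait strains
n_tests = n_baits * N_PREYS           # 26,640 bait-prey combinations

def z_scores(colony_sizes):
    """Standardize colony sizes against the whole set (assumed normalization)."""
    mu = mean(colony_sizes)
    sd = stdev(colony_sizes)
    return [(size - mu) / sd for size in colony_sizes]

def call_ppis(scores, threshold=2.5):
    """Return the indices of colonies whose z-score exceeds the threshold."""
    return [i for i, z in enumerate(scores) if z > threshold]

# Toy plate (invented values): 19 background colonies and one large colony.
toy_sizes = [100] * 19 + [400]
hits = call_ppis(z_scores(toy_sizes))
```

With these toy values only the last colony passes the cut-off. A real analysis would average the four replicates before calling interactions.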
PCA signal reflects the super-organization of protein complexes
To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and COG complexes were also performed. Among the 10,192
unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one PPI) after
filtration.
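Counting reciprocal interactions only once amounts to treating each detected bait-prey pair as unordered. A minimal sketch, using the placeholder names X, Y and Z from the notation above (the function name and the example list are illustrative, not data from the screen):

```python
def unique_pairs(detected):
    """Collapse (bait, prey) tuples into unordered pairs.

    X as bait with Y as prey and Y as bait with X as prey map to the same
    frozenset, so a reciprocal detection is counted once.
    """
    return {frozenset(pair) for pair in detected}

# Placeholder example: X-Y is seen in both orientations, X-Z in only one.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_pairs = len(detected)                   # 3 directional detections
n_unique = len(unique_pairs(detected))    # 2 unordered pairs
```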
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers.
However, the increased linker length had an obvious impact for 93 (83 unique) interacting
pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs,
and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker
length can substantially widen the repertoire of detected interactions for a complex.
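The breakdown of interacting pairs into unchanged, quantitatively changed and new can be sketched as a simple classification rule. The criterion for a significant change is not stated in this paragraph, so the rule below borrows the z-score cut-off of 2.5 and the two-fold difference used in the first experiment. Both thresholds, the function name and the category labels are assumptions. The tallies at the end are the counts reported above.

```python
def classify(z_short, z_long, threshold=2.5, fold=2.0):
    """Compare one pair measured with the 2xL (z_short) and the 4xL (z_long)."""
    hit_short = z_short > threshold
    hit_long = z_long > threshold
    if not hit_short and not hit_long:
        return "not detected"
    if hit_long and not hit_short:
        return "new"
    if hit_short and not hit_long:
        return "lost"
    ratio = z_long / z_short          # both scores exceed the threshold here
    if ratio >= fold:
        return "increased"
    if ratio <= 1.0 / fold:
        return "decreased"
    return "unchanged"

# Counts reported in the text for the 228 interacting pairs.
reported = {"unchanged": 135, "quantitatively changed": 19, "new": 74}
total_pairs = sum(reported.values())                       # 228
fraction_unchanged = reported["unchanged"] / total_pairs   # about 0.59
```

The tallies are internally consistent: 19 changed plus 74 new gives the 93 affected pairs, and 135 unchanged plus 93 affected gives 228.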
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual
polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are more readily detected with longer linkers
(Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases
where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C).
This demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
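As an illustration of the correlation analysis described above, the following minimal sketch computes a Spearman rank correlation between pairwise C-terminal distances and interaction z-scores. It is not the analysis code used for this study: the function names are hypothetical, the six data points are invented toy values, and only the Python standard library is used.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data (NOT from Table S2A): distances in angstroms and z-scores.
distances = [20.0, 35.0, 50.0, 80.0, 120.0, 150.0]
z_scores = [9.1, 7.4, 8.0, 3.2, 2.9, 0.5]

rho = spearman(distances, z_scores)
print(f"Spearman r = {rho:.2f}")
```

A negative coefficient, as in this toy example, corresponds to the trend reported here: the PCA signal tends to decrease as the distance between C-termini increases.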
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted
with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome.
Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting each pair of reciprocal interactions, such as
X-DHFR F[1,2] with Y-DHFR F[3] and Y-DHFR F[1,2] with X-DHFR F[3], as a single PPI.
(D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to
the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged
PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
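The bookkeeping behind panels (B) and (C) can be sketched as follows: reciprocal bait-prey observations are merged into a single unordered PPI, and each PPI is then labeled by comparing the reference linker with a longer linker combination. This is only an illustrative sketch. The function names, the rule that a pair counts as detected when either orientation is detected, and the placeholder proteins X, Y, Z and W are assumptions, and the statistical test behind "quantitatively changed" PPIs is not reproduced.

```python
def collapse_reciprocal(detections):
    """Merge (bait, prey) and (prey, bait) observations into one unordered PPI.

    `detections` maps (bait, prey) tuples to True/False ("detected").
    Assumption of this sketch: a pair is detected if either orientation is.
    """
    merged = {}
    for (bait, prey), detected in detections.items():
        pair = frozenset((bait, prey))
        merged[pair] = merged.get(pair, False) or detected
    return merged


def classify(reference, longer):
    """Label each PPI detected at least once as 'new', 'kept' or 'lost'."""
    labels = {}
    for pair in reference.keys() | longer.keys():
        before = reference.get(pair, False)
        after = longer.get(pair, False)
        if after and not before:
            labels[pair] = "new"    # only seen with the longer linker
        elif before and after:
            labels[pair] = "kept"   # seen with both linker combinations
        elif before and not after:
            labels[pair] = "lost"   # only seen with the reference linker
    return labels


# Hypothetical detections for placeholder proteins X, Y, Z and W.
reference_2xl = {("X", "Y"): False, ("Y", "X"): False,
                 ("Y", "Z"): True, ("Z", "W"): True}
longer_4xl = {("X", "Y"): True, ("Y", "X"): False,
              ("Y", "Z"): True, ("Z", "W"): False}

labels = classify(collapse_reciprocal(reference_2xl),
                  collapse_reciprocal(longer_4xl))
for pair, label in sorted(labels.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), label)
```

Using a `frozenset` as the key makes the two orientations of a pair compare equal, so each PPI is counted once when proportions are computed.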
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini: interactions that are not affected
by longer linkers, those that increase in signal and those that are newly detected. p-values of
Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
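For readers unfamiliar with the identity values tabulated above, the following sketch shows one common way to compute a percent identity from two pre-aligned sequences. The exact procedure used for this table is not described in this section, so the function name, the convention of ignoring gapped columns, and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the aligned, non-gap columns.

    Both inputs must already be aligned and therefore have equal length.
    Assumption of this sketch: columns with a gap in either sequence are
    excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where one sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment, not taken from any structure in the table.
structure_seq = "MKT-AYLL"
experimental_seq = "MKTQAYIL"
identity = percent_identity(structure_seq, experimental_seq)
print(f"{identity:.1f}")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which would give slightly different values for the same alignment.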
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected
results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by
DHFR PCA in spot-dilution assays of selected results from the intra-complex experiment.
Examples for each category of change are shown. Cell growth in the spot-dilution assay (right)
correlates with colony size in standard PCA (left).
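The relation between colony size, z-score and the detection threshold in panel (B) can be illustrated with a short sketch. It assumes that the z-score is the deviation of a colony size from the mean of a background distribution, expressed in background standard deviations, and that the cut-off is 2.5. The function name and all numbers are hypothetical, and the scoring pipeline actually used for the screen may differ.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cut-off assumed in this sketch


def z_scores(colony_sizes, background_sizes):
    """Express each colony size as a deviation from the background noise."""
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)  # sample standard deviation
    return [(size - mu) / sigma for size in colony_sizes]


# Hypothetical colony sizes (arbitrary pixel units), not data from this study.
background = [100, 110, 90, 105, 95]
tested = [102, 150, 98, 131]

scores = z_scores(tested, background)
positives = [z > Z_THRESHOLD for z in scores]
for size, z, hit in zip(tested, scores, positives):
    print(size, round(z, 2), "positive" if hit else "background")
```

With this convention, the dashed or gray threshold line in each histogram corresponds to the colony size at which the z-score equals the cut-off.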
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
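A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm on a graph whose nodes are surface points and whose edges carry Euclidean distances. The sketch below is not the modeling code used for this study: the function name, the neighbor cut-off that decides which points are connected, and the four toy coordinates are assumptions.

```python
import heapq
from math import dist, inf


def shortest_surface_path(points, start, goal, cutoff):
    """Length of the shortest path from `start` to `goal` (Dijkstra).

    `points` maps a label to (x, y, z) coordinates. Two points are
    connected when they lie within `cutoff` of each other (an assumption
    of this sketch); each edge is weighted by the Euclidean distance.
    Returns `inf` when the goal cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        travelled, node = heapq.heappop(heap)
        if node == goal:
            return travelled
        if travelled > best.get(node, inf):
            continue  # stale heap entry, a shorter route was already found
        for other, coords in points.items():
            if other == node:
                continue
            step = dist(points[node], coords)
            if step <= cutoff and travelled + step < best.get(other, inf):
                best[other] = travelled + step
                heapq.heappush(heap, (travelled + step, other))
    return inf


# Hypothetical coordinates in angstroms: two C-termini and two surface points.
toy_points = {
    "cterm_a": (0.0, 0.0, 0.0),
    "surface_1": (3.0, 4.0, 0.0),
    "surface_2": (6.0, 8.0, 0.0),
    "cterm_b": (6.0, 8.0, 12.0),
}

path_length = shortest_surface_path(toy_points, "cterm_a", "cterm_b", cutoff=12.0)
print(path_length)
```

Because the path must pass through connected surface points, its length is at least the straight-line distance between the two C-termini, which is why it serves as a conservative proxy for the span that the linkers have to bridge around the complex.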
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are
close in space without necessarily being in physical interaction. Such a method would
thus allow the architecture of protein complexes to be examined in greater depth and dissected
more finely. In practice, this meant modifying the length of the linkers used in DHFR PCA
in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the
linker length changed the interactions detected. It was also relevant to test the application
of the method to the study of protein complexes using several combinations of linkers of
different lengths. Finally, the validation of the method could be completed by comparing the
results obtained with distances measured from the available protein structures of the
proteasome.
The results of the first validation show that by adjusting a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed
within protein complexes and between different complexes, notably between the
proteasome and the actin cytoskeleton. The nature of the associations detected suggests that
the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth
study of the five protein complexes shows that this variation of DHFR PCA detects new
interactions while preserving the specificity of the method. Indeed, among all the unique
interactions detected, more than 30% were new. One could therefore expect to obtain
nearly as many new interactions if this variation of PCA were applied to protein complexes
that have already been studied. This percentage could vary with the number of combinations
of linkers of different lengths used. For example, it could be lower if only one combination
were used, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems to
be sufficient to detect the majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased
are more likely caused by steric effects than by destabilization of the
proteins involved. However, these cases can still provide structural information,
notably by identifying the strongest associations within the complex.
In addition, the use of extended linkers informs on the organization of protein
complexes, particularly when it involves central proteins. Finally, the associations
detected closely reflect the organization of protein complexes into subcomplexes.
By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length
does allow the detection of associations between proteins that are further apart
in space.
The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply by doubling the length of the linker of the
DHFR F[1,2] fragment, it is possible to increase the ability to detect distant
protein-protein associations. In future experiments, it would be appropriate to use the
standard linker in addition to linkers of other lengths, which would provide
a validation and a point of comparison and would help detect problems that may have arisen
during construction of the proteins. For example, it is easier to spot a problem of
incorrect recombination or the appearance of mutations. Indeed, interactions would be
observed for the correctly constructed protein, whereas the problematic one
would show none. It is clear, however, that adding this control
makes the experiments and analyses more complex. Despite this drawback, this variation of
DHFR PCA provides an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet can also be applied on a
large scale using a robotic platform. Moreover, DHFR PCA is an
in vivo method that keeps the endogenous promoter for protein expression. The
fragments do not tend to interact spontaneously with each other unless they are
very close together, which reduces false positives. DHFR PCA can be carried out on either
solid or liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the study of interaction dynamics (56). These
features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting
to establish a range of linker lengths that would provide several resolutions
of the PPI network. It would first be necessary to determine the maximum length that still
allows the detection of plausible protein-protein associations while limiting false positives.
It would also be necessary to determine the optimal increment that maximizes new information
while taking into account the additional complexity that comes with each added linker. The
availability of robotic platforms makes the creation of collections of DHFR F[1,2] proteins
with different linker lengths more realistic. Such additional collections
would provide a picture of the yeast protein-protein association network at different
resolutions, from fine to coarse. Indeed, the longer the linker,
the more distant the associations detected, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively,
its characteristics, such as its size and flexibility, should be taken into consideration. For
small protein complexes, a finer resolution, and thus shorter linkers, could prove
sufficient, whereas the resolution would need to be coarser for
large protein complexes.
The method developed during this master's project becomes particularly interesting
for the study of macromolecular protein complexes. These are complexes whose
composition is not perfectly known but which are visible by electron
microscopy or with other imaging methods. The size of these complexes greatly
limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular
stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by
lengthening the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the 10,192
unique tested PPIs, 755 interactions were considered true PPIs after filtering (Fig. S1B and
Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal
interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a
single PPI).
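The distinction between all protein pairs and unique pairs can be illustrated with a minimal Python sketch. It assumes that each detected interaction is recorded as a (bait, prey) tuple, the bait carrying DHFR F[1,2] and the prey DHFR F[3]; the function name and the example pairs are hypothetical and are not taken from Table S1C.

```python
def count_pairs(interactions):
    """Return (number of oriented pairs, number of unique unordered pairs)."""
    oriented = set(interactions)                      # X-Y and Y-X are distinct here
    unique = {frozenset(pair) for pair in oriented}   # X-Y and Y-X collapse to one PPI
    return len(oriented), len(unique)


# Hypothetical detected interactions written as (bait, prey).
detected = [
    ("Rpb5", "Rpo26"),
    ("Rpo26", "Rpb5"),   # reciprocal of the first pair
    ("Rpb2", "Rpb3"),
    ("Cog1", "Cog2"),
]

n_oriented, n_unique = count_pairs(detected)
print(n_oriented, n_unique)  # 4 3
```

Using an unordered set per pair is only one possible convention; it also treats a protein tested against itself as a single-member pair.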
As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3],
were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of
interacting pairs (135/228, or 114/197 unique), no significant change in interaction
strength was observed when using the 4xL compared to the 2xL, which supports the view that
elongated linkers cause no overall decrease in specificity. However, the increased linker
length had an obvious impact on 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal
was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new
PPIs were detected using at least one 4xL. Thus, doubling the linker length can
substantially widen the repertoire of detected interactions for a complex.
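The counts quoted in this paragraph can be cross-checked with a few lines of Python. The numbers are those given in the text; the variable and function names are chosen here for illustration only.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts quoted in the text, as (all pairs, unique pairs).
interacting = (228, 197)
unchanged = (135, 114)
changed = (19, 18)
new = (74, 65)

for label, idx in (("all", 0), ("unique", 1)):
    affected = changed[idx] + new[idx]
    # Unchanged, changed and new pairs should add up to all interacting pairs.
    assert unchanged[idx] + affected == interacting[idx]
    print(label, round(percent(unchanged[idx], interacting[idx]), 1), affected)
# all 59.2 93
# unique 57.9 83
```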
In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect
new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL
compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to
PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two
largest components of RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects
rather than from destabilization of the protein (Fig. 1D).
Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes than for the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual
polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6,
an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). Together, these results
show that doubling the linker length of central proteins in complexes expands the network
of interactions detected by DHFR PCA and helps to better describe the organization of
protein complexes in living cells.
In addition to uncovering new interactions, the PCA signal obtained with longer linkers
allowed better discrimination between the different subunits of large complexes. This is
particularly well illustrated by the proteasome (Fig. 1D and 1E, center panels). More PPIs
are detected when the two proteins are in the same subcomplex (such as base-base, core-core
and lid-lid), regardless of linker length, though the fraction is systematically higher
with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D
and 1E, left and right panels). Structural biology in living cells could thus gain from PPI
data obtained with several linker lengths.
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for
distance, we measured the shortest path between the C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig. 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score
of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible in the cumulative
distribution of z-scores as a function of pairwise distance, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig. 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that
the distance between proteins is significantly longer in cases where a longer linker
increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
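To make the rank correlation between pairwise distance and z-score concrete, the following minimal Python sketch computes a Spearman coefficient using only the standard library. The distances and z-scores are invented for the example and are not data from this study, the function names are hypothetical, and the sketch assumes that tied values receive the average of their ranks. It is not the analysis pipeline used here, and it does not compute a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminus distances (in Å) and interaction z-scores.
distances = [12.0, 35.5, 48.0, 80.2, 110.7]
z_scores = [6.1, 4.8, 5.0, 2.2, 0.9]

rho = spearman(distances, z_scores)
print(round(rho, 2))  # -0.9 for these made-up values
```

A negative coefficient, as in this toy example, corresponds to z-scores that tend to decrease as the distance between C-termini increases.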
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact
directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to
detect and measure protein proximity in living cells and among endogenously expressed
proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to
82 Å, can be used to detect interactions under these specific conditions, with an increased
signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including
interactions among complexes and subcomplexes within large complexes. Because a single
longer linker is generally sufficient to detect new interactions, the current strains from
the DHFR PCA collection could be used as preys, requiring only the construction of baits
with different linker sizes. PCA is therefore an addition to the other methods available to
obtain low-resolution structural information on the subunits of complexes, which include
chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID
proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these
other technologies in recent years, PCA will remain the simplest assay because it requires
minimal infrastructure investment and can be adapted for high-throughput screening, which
is still difficult to achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems
Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC
NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on
the manuscript and Marie Filteau for guidance on the statistical analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale
protein-fragment complementation assay (PCA) screen and prove useful for inferring the
super-organization of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and
a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with
red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares:
proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs
(significantly decreased or increased when compared to the 2xL-2xL reference interaction).
Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected
with a longer linker combination). (C) Proportions of quantitatively changed interactions
and new PPIs versus unchanged PPIs for all complexes, counting each pair of reciprocal
interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single
PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is
proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI.
Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new
PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the
experiment. (E) Proportion of detected PPIs out of the total tested for each combination of
subcomplexes within complexes.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome
catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at
different distances or in different subunits are highlighted on each structure. Distances
between the C-termini of these selected proteins and the associated PPI z-scores for these
newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance
between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL:
r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the protein pairwise distances.
(C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini: interactions that are not
affected by longer linkers, those that increase in signal, and those that are newly
detected. p-values of Wilcoxon tests are shown.
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
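As a rough illustration of what an identity percentage of this kind measures, the Python sketch below compares two sequences that have already been aligned. The alignment procedure and gap convention behind the values in this table are not described in this section, so the sketch makes two assumptions that are labelled as such: the sequences are pre-aligned to the same length with "-" marking gaps, and columns containing a gap are excluded from the comparison. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments: one gap column and one mismatch.
structure_seq = "MKT-LLV"
experimental_seq = "MKTALIV"
print(round(percent_identity(structure_seq, experimental_seq), 1))  # 83.3
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, would give different percentages for the same alignment.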
Table S2C. Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III
and proteasome structures
Yeast protein   Complex   Reference   Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
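The counts in Table S2D correspond to the residues at the end of each protein that are absent from the structure. As a minimal sketch of how such a count can be derived, the Python function below subtracts the highest resolved residue number from the full sequence length. The function name and the example numbers are hypothetical, and the sketch assumes that residue numbering in the structure follows the full-length protein sequence, starting at 1.

```python
def missing_c_terminal(sequence_length, resolved_positions):
    """Count C-terminal residues that are absent from a structure.

    sequence_length: length of the full protein sequence.
    resolved_positions: residue numbers (1-based) present in the structure.
    Assumes structure numbering matches the full-length sequence.
    """
    resolved = list(resolved_positions)
    if not resolved:
        # Nothing is resolved, so the whole chain is missing.
        return sequence_length
    return sequence_length - max(resolved)


# Hypothetical example: a 137-residue protein resolved up to residue 100.
print(missing_c_terminal(137, range(1, 101)))
```

A value of 0 means that the C-terminal residue itself is resolved, so the position of a C-terminal fusion can be read directly from the structure.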
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions
with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92
for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the
proteasome for all combinations of linker lengths according to the distance between the
interacting proteins. The red line represents the density of distances for all interactions.
The distribution for detected interactions is shifted to the left because proteins are closer
to each other when the interactions are detected. The 4xL-4xL distribution is also slightly
shifted to the right due to the ability of the 4xL to detect interactions further in space.
(E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment,
showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of
selected results for the intra-complex experiment. Examples for each category of changes are
shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA
(left).
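The z-score threshold described in (B) and the reciprocal correlation described in (C) can be illustrated with a short sketch. This is a minimal Python example under stated assumptions, not the analysis used to produce the figure: the function names and toy colony sizes are hypothetical, z-scores are computed against the mean and sample standard deviation of all supplied colony sizes (the reference distribution actually used is not stated in this caption), and the default cutoff of 2.5 is taken from the threshold described in (B).

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Return the z-score of each value relative to the whole sample."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (assumption)
    return [(v - mu) / sigma for v in values]


def positive_ppis(colony_sizes, threshold=2.5):
    """Return the pairs whose colony-size z-score exceeds the threshold."""
    names = list(colony_sizes)
    scores = z_scores([colony_sizes[name] for name in names])
    return {name for name, z in zip(names, scores) if z > threshold}


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): 20 background pairs and one strong signal.
sizes = {f"bg{i}": 95 + 10 * (i % 2) for i in range(20)}
sizes["baitA-preyB"] = 1000
print(positive_ppis(sizes))

# Reciprocal measurements (bait-prey vs prey-bait), also hypothetical.
forward = [120, 340, 560, 980]
reverse = [100, 300, 610, 900]
print(round(pearson_r(forward, reverse), 2))
```

With a small sample, the largest attainable z-score is bounded by the sample size, so a cutoff such as 2.5 is only meaningful when the background distribution contains enough measurements.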
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
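The distance-weighted shortest path in (C) and (D) can be sketched as a graph search over surface residues. The Python example below is an illustration under stated assumptions rather than the procedure used for the figure: the function names, the toy coordinates and the neighbour cutoff are hypothetical. Surface points closer than the cutoff are connected by an edge weighted by their Euclidean distance, and Dijkstra's algorithm returns the length of the shortest route between two points, which stands in for the path a linker would follow around the complex rather than through it.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of points closer than `cutoff`, weighted by distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Toy surface (hypothetical coordinates): the direct 0-2 distance is 5,
# but with a cutoff of 4.5 the path must go through point 1.
surface = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
graph = build_graph(surface, cutoff=4.5)
print(shortest_path_length(graph, 0, 2))
```

The pairwise graph construction is quadratic in the number of surface points; for a full proteasome surface a spatial index would be a more practical choice.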
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are in spatial
proximity without necessarily being physical interactions. Such a method would make it
possible to examine and dissect the architecture of protein complexes in greater depth.
Concretely, the approach consisted of modifying the length of the linkers of the DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results obtained with the distances measured from the available protein
structures of the proteasome.
The results of the first validation show that by changing a single parameter, namely doubling
the length of one linker, the signal-to-noise ratio increased significantly, allowing better
identification of associations. Seven new associations were observed within protein complexes
and between different complexes, notably between the proteasome and the actin cytoskeleton.
The nature of the associations detected suggests that the specificity of the DHFR PCA is
preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of the DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all the unique interactions detected, more than
30% were new. One could therefore expect to obtain practically as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
used. For example, it could be reduced by using only one combination, since some
protein-protein associations were detectable only with a specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Furthermore, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves the central proteins.
Finally, the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length does
allow the detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a real advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of additional lengths, which would provide a validation and a point of
comparison and would help detect problems that may have arisen during the construction of
the proteins. For example, it is easier to spot a problem of incorrect recombination or the
appearance of mutations. Indeed, interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. However, adding this control certainly
makes the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA gives access to an additional hybrid method that remains relatively simple. It
requires no special infrastructure, yet can also be applied at large scale using a robotic
platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter
for protein expression. The fragments do not tend to interact spontaneously with each other
unless they are very close together, which reduces false positives. The DHFR PCA can be
performed on either solid or liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be followed in
real time, which gives access to the study of interaction dynamics (56). These elements
provide certain advantages compared with other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. It would
first be necessary to determine the maximum length that allows the detection of plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize new information, taking into account the
additional complexity that comes with each added linker. The availability of robotic
platforms makes the creation of collections of DHFR F[1,2] proteins with different linker
lengths more realistic. Such additional collections would give a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the associations detected, which decreases the
molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into consideration. For
small protein complexes, a finer resolution, and therefore shorter linkers, could be
sufficient, whereas the resolution should be lower for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
perfectly known but which are visible by electron microscopy or other imaging methods. The
size of these complexes greatly limits their study and makes determining their architecture
a challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us
a general picture of their architecture. For the proteome of an organism, this method would
provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). Together, these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.
In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed
better discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
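The within-subcomplex comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name `detection_fraction`, the input layout and the toy detection calls are assumptions made for the example, not the analysis code or data from this study.

```python
from collections import defaultdict

def detection_fraction(tested_pairs, detected_pairs, subcomplex_of):
    """Return {(subcomplex_a, subcomplex_b): detected / tested}.

    Keys are sorted so that (lid, base) and (base, lid) are counted together.
    """
    tested = defaultdict(int)
    detected = defaultdict(int)
    # Store detected pairs as unordered sets so orientation does not matter.
    detected_set = {frozenset(pair) for pair in detected_pairs}
    for a, b in tested_pairs:
        key = tuple(sorted((subcomplex_of[a], subcomplex_of[b])))
        tested[key] += 1
        if frozenset((a, b)) in detected_set:
            detected[key] += 1
    return {key: detected[key] / n for key, n in tested.items()}

# Toy input (hypothetical detection calls, for illustration only).
subcomplex_of = {"Rpn5": "lid", "Rpn6": "lid", "Rpt1": "base",
                 "Rpt2": "base", "Pre1": "core"}
tested = [("Rpn5", "Rpn6"), ("Rpt1", "Rpt2"), ("Rpn5", "Rpt1"), ("Rpn6", "Pre1")]
detected = [("Rpn6", "Rpn5"), ("Rpt1", "Rpt2"), ("Rpn5", "Rpt1")]
fractions = detection_fraction(tested, detected, subcomplex_of)
for key in sorted(fractions):
    print(key, fractions[key])
```

Running the same function once per linker combination would give one table per combination, which is the kind of comparison summarized in Fig. 1E.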
Longer linkers allow detection of more distant proteins in complexes
Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
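The rank correlation between pairwise distance and z-score reported here can be illustrated with a small, dependency-free sketch. The helper names (`average_ranks`, `pearson`, `spearman`) and the six data points are hypothetical and chosen only to show the calculation; they are not values from Table S2A.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical C-terminus distances (in Å) and PPI z-scores.
distances = [12.0, 35.5, 48.0, 80.2, 110.7, 150.3]
z_scores = [9.1, 7.4, 7.9, 3.2, 1.5, 0.4]
rho = spearman(distances, z_scores)
print(round(rho, 3))
```

A negative coefficient, as in this toy example, means that z-scores tend to decrease as the distance between C-termini increases.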
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions under these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information on the subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted
with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome.
Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting reciprocal interactions, such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
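Panel (A) describes z-scores as a quantitative deviation from the background noise. A minimal sketch of that idea follows. It assumes, purely for illustration, that the background is summarized by the mean and sample standard deviation of a set of background colony sizes; the function names, the threshold value and the numbers are hypothetical, and the scoring pipeline actually used in the study may differ.

```python
import statistics

def z_scores(colony_sizes, background_sizes):
    """Express each colony size as a deviation from the background, in SD units."""
    mu = statistics.mean(background_sizes)
    sigma = statistics.stdev(background_sizes)  # sample standard deviation
    return [(size - mu) / sigma for size in colony_sizes]

def positive_pairs(pairs, colony_sizes, background_sizes, threshold):
    """Keep the pairs whose z-score exceeds the chosen threshold."""
    zs = z_scores(colony_sizes, background_sizes)
    return [pair for pair, z in zip(pairs, zs) if z > threshold]

# Hypothetical colony sizes (arbitrary units).
background = [10, 12, 11, 9, 13]
pairs = [("X", "Y"), ("X", "Z")]
sizes = [11, 20]
zs = z_scores(sizes, background)
hits = positive_pairs(pairs, sizes, background, threshold=3.0)
print(zs, hits)
```

With this toy input, the first pair sits exactly at the background mean and only the second pair passes the threshold.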
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between the z-scores of all detected PPIs in the proteasome and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value <
2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
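For readers unfamiliar with the Wilcoxon test mentioned in panel (C), the sketch below computes the rank-sum (Mann-Whitney U) statistic for two independent groups of distances, with a two-sided p-value from the plain normal approximation. It assumes the unpaired form of the test and omits tie and continuity corrections; the function names and the distances are hypothetical and only illustrate the comparison.

```python
import math

def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: number of (a, b) pairs with a > b, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

def two_sided_p_normal(u, n_a, n_b):
    """Two-sided p-value from the normal approximation (no tie or continuity correction)."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical C-terminus distances (in Å) for two categories of PPIs.
unchanged = [20, 25, 31, 40]
increased_or_new = [45, 52, 60, 75, 38]
u = mann_whitney_u(increased_or_new, unchanged)
p = two_sided_p_normal(u, len(increased_or_new), len(unchanged))
print(u, p)
```

For samples this small an exact test would be preferable; the normal approximation is used here only to keep the example short.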
Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect
interactions between proteins that are further apart in space. (E) Repetition of the standard
DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility.
(F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the
intra-complex experiment. Examples for each category of change are shown. Cell growth in the
spot-dilution assay (right) correlates with colony size in the standard PCA (left).
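Panel (C) compares the signal of each interaction with that of its reciprocal orientation (bait and prey swapped). The sketch below shows one way such reciprocal measurements could be matched before computing a correlation. The data layout, the function name `reciprocal_signals` and the values are assumptions made for illustration and do not come from Tables S1B or S1C.

```python
def reciprocal_signals(signal):
    """Match each (bait, prey) measurement with its (prey, bait) counterpart.

    `signal` maps (bait, prey) tuples to a PCA signal. Returns the matched
    pairs and two aligned lists: forward signals and reciprocal signals.
    Pairs measured in only one orientation are skipped.
    """
    pairs, forward, reverse = [], [], []
    for (bait, prey) in sorted(signal):
        # Visit each unordered pair once, from its alphabetically first member.
        if bait < prey and (prey, bait) in signal:
            pairs.append((bait, prey))
            forward.append(signal[(bait, prey)])
            reverse.append(signal[(prey, bait)])
    return pairs, forward, reverse

# Hypothetical colony-size signals for three proteins X, Y and Z.
signal = {
    ("X", "Y"): 5.0, ("Y", "X"): 4.5,
    ("X", "Z"): 2.0, ("Z", "X"): 2.4,
    ("Y", "Z"): 1.0,  # reciprocal orientation not measured
}
pairs, forward, reverse = reciprocal_signals(signal)
print(pairs, forward, reverse)
```

The two aligned lists can then be passed to any correlation routine to obtain a coefficient comparable to the r values quoted in the caption.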
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used
for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot
surface. Green spheres: surface residues on the proteasome.
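The distance-weighted shortest path of panels (C) and (D) can be illustrated with Dijkstra's algorithm on a graph of 3D points. In this sketch, the points stand in for surface residues, edges connect points closer than a cutoff `max_edge`, and each edge is weighted by its Euclidean length. The cutoff, the function name and the coordinates are assumptions made for the example; the caption does not specify how the graph was actually built.

```python
import heapq
import math

def shortest_path_length(points, start, goal, max_edge):
    """Length of the shortest path from points[start] to points[goal].

    Two points are connected when their Euclidean distance is <= max_edge,
    and the edge weight is that distance. Returns math.inf if unreachable.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            return d
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        for v in range(len(points)):
            if v == u:
                continue
            w = math.dist(points[u], points[v])
            if w <= max_edge and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf

# Hypothetical coordinates (in Å): the direct route 0 -> 2 is 5 Å long.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
around = shortest_path_length(points, 0, 2, max_edge=4.5)  # must go via point 1
direct = shortest_path_length(points, 0, 2, max_edge=6.0)  # direct edge allowed
print(around, direct)
```

Restricting edges to neighbouring surface points forces the path to follow the surface of the complex instead of cutting through it, which is why the path can be longer than the straight-line distance between the two C-termini.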
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close to each
other in space, without these associations necessarily being physical interactions. Such a
method would make it possible to examine and dissect the architecture of protein complexes
in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA
in S. cerevisiae. To validate the method, we first had to verify whether increasing the linker
length changed the interactions detected. It was also relevant to test the method on protein
complexes using several combinations of linkers of different lengths. Finally, the validity of
the method could be further confirmed by comparing the results obtained with the distances
measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, which allowed
associations to be identified more reliably. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the associations detected suggests that the specificity of
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all the unique interactions
detected were new. One could therefore expect to obtain nearly as many new interactions if
this variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths that are
used. For example, it could be lower if only one combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most
of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest associations
within the complex. The use of extended linkers also sheds light on the organization of protein
complexes, particularly when it involves central proteins. Finally, the associations detected
closely reflect the organization of protein complexes into subcomplexes. Comparing the
distances between proteins in the proteasome structures with the PCA results confirms that
increasing the linker length does allow the detection of associations between proteins that are
farther apart in space.
The modification made to DHFR PCA is a valuable advance for the study of protein-protein
associations. Doubling only the length of the linker on the DHFR F[1,2] fragment increases
the ability to detect distant protein-protein associations. In future experiments, it would be
appropriate to use the standard linker alongside the linkers of additional lengths. This would
provide a validation and a point of comparison, and would help to detect problems that may
have occurred during the construction of the fusion proteins. For example, it becomes easier to
spot incorrect recombination or the appearance of mutations, because interactions would be
observed for the correctly constructed protein whereas the faulty construct would show none.
Adding this control does, however, make the experiments and the analyses more complex.
Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that
remains relatively simple. It requires no special infrastructure, yet it can also be applied on a
large scale with a robotic platform. In addition, DHFR PCA is an in vivo method that retains
the endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close to each other, which reduces false positives. DHFR
PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. The assay can also be
followed in real time, which gives access to the dynamics of interactions (56). These features
give it certain advantages over other hybrid methods.
Only two linker lengths were tested in this project. It would be interesting to establish a
range of linker lengths that would give access to several resolutions of the PPI network. The
first step would be to determine the maximum length at which plausible protein-protein
associations can still be detected while limiting false positives. The optimal increment would
also have to be determined, so as to maximize the new information gained while taking into
account the complexity added by each additional linker. The availability of robotic platforms
makes it more realistic to create collections of DHFR F[1,2]-tagged proteins with different
linker lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the linker,
the more distant the associations detected, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into account. For small protein complexes, a finer resolution, and
therefore shorter linkers, might be sufficient, whereas a coarser resolution would be needed for
large protein complexes.
The method developed during this master's project is particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. The
size of these complexes greatly limits their study and makes determining their
architecture challenging. Processing bodies and stress granules are examples. They are
involved, respectively, in the degradation and the storage of messenger RNA during
cellular stress, and they are notably linked to various diseases such as cancer and
acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by
lengthening the linker would give us a general picture of their architecture. For the
proteome of an organism, this method would provide a better definition of the organization
of the cellular machinery.
Bibliography
1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36
22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer in cases where a
longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
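
The comparison of distances between categories of PPIs can be illustrated with a rank-based test. The sketch below is not the analysis code used in this study: it assumes that the test is a two-sided Wilcoxon rank-sum test between two independent groups, uses a normal approximation without tie or continuity correction, and runs on invented distances. The function names and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Two-sided rank-sum test; both groups must be non-empty.

    Returns (U statistic of group_a, approximate p-value).
    Assumption: normal approximation, no tie or continuity correction.
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value


# Invented C-termini distances (in angstroms), for illustration only.
unchanged = [35.0, 42.0, 51.0]
increased_or_new = [78.0, 90.0, 104.0]
u_stat, p = rank_sum_test(unchanged, increased_or_new)
```

With real data, an established statistics package would be preferable, since it also handles ties and small samples exactly.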
Conclusion
Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here we
show that DHFR PCA, with a modest increase in linker size from 41 Å to 82 Å, can be used to
detect interactions under these specific conditions with an increased signal-to-noise ratio
and with an enhanced ability to detect distant PPIs, including interactions among complexes
and subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection could
be used as preys, requiring only the construction of baits with different linker sizes. PCA
is therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.
Acknowledgements
Funding for this project came from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.
Figure 1 Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are
highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering
for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares:
proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs
(significantly decreased or increased when compared to the 2xL-2xL reference interaction).
Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with
a longer linker combination). (C) Proportions of quantitatively changed interactions and new
PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
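
Counting reciprocal interactions as a single PPI, as in panel (C), can be sketched as follows. This is an illustration rather than the code used for the figure. The category labels, the function names and the rule applied when the two orientations disagree (keep the category listed first in `PRIORITY`) are assumptions introduced for the example.

```python
from collections import Counter

# Assumed tie-break order when bait-prey and prey-bait results disagree.
PRIORITY = ["new", "changed", "unchanged"]


def collapse_reciprocal(observations):
    """Merge X-F[1,2]/Y-F[3] and Y-F[1,2]/X-F[3] into one unordered pair.

    observations: iterable of (bait, prey, category) tuples.
    Returns a dict mapping frozenset({bait, prey}) to a single category.
    """
    pairs = {}
    for bait, prey, category in observations:
        key = frozenset((bait, prey))
        current = pairs.get(key)
        if current is None or PRIORITY.index(category) < PRIORITY.index(current):
            pairs[key] = category
    return pairs


def category_proportions(pairs):
    """Proportion of unique PPIs in each category (pairs must be non-empty)."""
    counts = Counter(pairs.values())
    total = sum(counts.values())
    return {category: counts[category] / total for category in PRIORITY}


# Hypothetical observations, for illustration only.
observed = [
    ("X", "Y", "unchanged"),
    ("Y", "X", "changed"),
    ("X", "Z", "new"),
]
unique_ppis = collapse_reciprocal(observed)
shares = category_proportions(unique_ppis)
```

Using an unordered key means a pair is counted once regardless of which protein carries which DHFR fragment.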
Figure 2 Longer linkers allow for the detection of more distant proteins within
complexes
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores
for these newly detected interactions are indicated in the tables. DHFR fragments have also
been modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly
detected. p-values of Wilcoxon tests are shown.
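
The Spearman correlation reported in panel (B) is a Pearson correlation computed on ranks. The sketch below shows that calculation in plain Python on invented values. It is not the code used for the figure, and the function names and example data are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors.

    Raises ValueError for mismatched or too-short inputs. Inputs in which
    all values are identical are not handled (the variance would be zero).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented example: PCA signal decreasing as C-termini distance grows.
distances = [10.0, 20.0, 30.0, 40.0]
z_scores = [8.0, 6.5, 3.0, 1.2]
r = spearman_r(distances, z_scores)
```

A negative coefficient, as in the caption, means that the PCA signal tends to decrease as the distance between C-termini increases.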
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
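
As an illustration of how pairwise distances between C-termini can be derived from a structural model, the sketch below computes Euclidean distances between one coordinate per protein. It is a minimal example under stated assumptions, not the modeling procedure used here: each protein is represented by a single 3D point (for instance an atom of its last resolved residue), coordinates are in angstroms, and the protein names and values are invented.

```python
import math
from itertools import combinations


def pairwise_c_term_distances(c_term_coords):
    """Return {(protein_a, protein_b): distance} for every unordered pair.

    c_term_coords: dict mapping a protein name to an (x, y, z) tuple.
    Pairs are keyed in alphabetical order of the protein names.
    """
    return {
        (a, b): math.dist(c_term_coords[a], c_term_coords[b])
        for a, b in combinations(sorted(c_term_coords), 2)
    }


# Hypothetical coordinates, for illustration only.
coords = {
    "ProtA": (0.0, 0.0, 0.0),
    "ProtB": (3.0, 4.0, 0.0),
    "ProtC": (0.0, 0.0, 12.0),
}
distances = pairwise_c_term_distances(coords)
```

When C-terminal residues are missing from a structure, the last resolved residue is only an approximation of the position of the true C-terminus.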
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
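
For readers unfamiliar with the identity values listed above, the sketch below shows one common way to compute a percent identity from two sequences that have already been aligned. It is an illustration only: the definition used here (matches divided by the number of aligned positions without a gap) is an assumption, the table does not state which definition or alignment tool was used, and the sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed definition: identical positions divided by the number of
    columns in which neither sequence has a gap, times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Invented sequences, for illustration only.
identity = percent_identity("MKTAY", "MKSAY")
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different values for the same alignment.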
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures
Yeast protein   Complex   Reference structure   Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of change are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).
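The z-score cutoff described in panel (B) can be illustrated with a minimal sketch. It assumes that the background is available as a list of colony sizes from non-interacting controls and that the z-score is the deviation from the background mean in units of the background standard deviation. The function names, the way the background is estimated and the default cutoff are assumptions made for illustration, not the procedure used in this work.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Deviation of each colony size from the background, in standard deviations."""
    mu = mean(background)       # background mean (hypothetical noise estimate)
    sigma = stdev(background)   # sample standard deviation of the background
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background, threshold=2.5):
    """Flag a PPI as positive when its z-score exceeds the cutoff (assumed default)."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Toy values, not experimental data
background = [10, 12, 11, 9, 13]
print(call_positive([11, 14, 20], background))
```

Keeping the cutoff as a parameter makes it easy to check how the set of positive PPIs changes with stringency.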
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
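A distance-weighted shortest path such as the one shown in panel (C) can be sketched as a graph search. The sketch below assumes that surface residues are given as 3D coordinates, that two residues are connected when they lie within a cutoff distance, and that each edge is weighted by the Euclidean distance between them. Dijkstra's algorithm then returns the shortest path that follows the surface between the two C-termini. The function name, the `max_edge` cutoff and the toy coordinates are hypothetical, and the actual procedure used to build the graph is not described in this section.

```python
import heapq
from math import dist  # Euclidean distance, Python 3.8+


def shortest_surface_path(points, start, end, max_edge=8.0):
    """Dijkstra over residues; edges join residues closer than max_edge.

    points: dict mapping a residue label to its (x, y, z) coordinates.
    Returns (path_length, list_of_labels), or (inf, []) if no path exists.
    """
    best = {start: 0.0}   # best known distance to each residue
    prev = {}             # predecessor of each residue on its best path
    done = set()
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == end:
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, coords in points.items():
            if v in done:
                continue
            step = dist(points[u], coords)
            # max_edge is an assumed neighbour cutoff, in the units of the coordinates
            if step <= max_edge and d + step < best.get(v, float("inf")):
                best[v] = d + step
                prev[v] = u
                heapq.heappush(heap, (d + step, v))
    return float("inf"), []


# Toy coordinates, not taken from a PDB file
residues = {"Cter_A": (0, 0, 0), "surf_1": (3, 4, 0), "Cter_B": (6, 0, 0)}
length, path = shortest_surface_path(residues, "Cter_A", "Cter_B", max_edge=5.5)
print(length, path)
```

In the toy example the direct segment between the two termini is longer than the cutoff, so the path has to pass through the intermediate surface residue. This mimics a path that follows the surface of the complex rather than crossing it.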
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close to each
other in space, without these associations necessarily being physical interactions. Such a
method would make it possible to examine and dissect the architecture of protein complexes
in greater depth. In practice, this meant modifying the length of the linkers of the DHFR PCA
in S. cerevisiae. To validate the method, we first had to verify whether increasing the linker
length changed the interactions detected. It was also relevant to test the application of the
method to the study of protein complexes using several combinations of linkers of different
lengths. Finally, the validity of the method could be further confirmed by comparing the
results obtained with the distances measured from the available protein structures of the
proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have already
been studied. This percentage could vary depending on the number of combinations of linkers
of different lengths that are used. For example, it could be lower if only one combination
were used, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be
sufficient to detect the majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers provides information on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with
the PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the longer
linkers, which would provide a validation and a point of comparison and would help detect
problems that may have occurred during the construction of the proteins. For example, it
becomes easier to spot incorrect recombination or the appearance of mutations: interactions
would be observed for the correctly constructed protein, whereas the problematic one would
show none. Adding this control does, however, make the experiments and analyses more
complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid
method that remains relatively simple. It requires no special infrastructure, yet it can also
be applied on a large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo
method that preserves the endogenous promoter for protein expression. The fragments do not
tend to associate spontaneously unless they are very close together, which reduces false
positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore
easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be monitored in real time, which gives access to the study of
interaction dynamics (56). These features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that provides several resolutions of the PPI network. One would first
need to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. One would also need to determine the optimal
increment that maximizes new information, taking into account the added complexity of each
additional linker. The availability of robotic platforms makes it more realistic to create
collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections
would provide a picture of the yeast protein-protein association network at different
resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the
associations detected, which lowers the molecular resolution. Before investigating a protein
complex more exhaustively, its characteristics, such as its size and flexibility, should be
taken into account. For small protein complexes, a finer resolution, and therefore shorter
linkers, may be sufficient, whereas a coarser resolution would be needed for large protein
complexes.
The method developed during this master's project is particularly attractive for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are linked to
various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale
of resolution made possible by lengthening the linker would give us a general picture of
their architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes.
(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) PPIs detected after data filtering in the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (not detected with the 2xL-2xL reference linker combination but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
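The counting rule in panel (C), where the two orientations of a bait-prey pair count as one PPI, can be sketched as follows. This is only an illustration under stated assumptions: the function name `collapse_reciprocal` and the example pairs are hypothetical and are not taken from the analysis pipeline used in this study.

```python
def collapse_reciprocal(pairs):
    """Return the set of unordered protein pairs found in `pairs`.

    Each element of `pairs` is (protein fused to DHFR F[1,2], protein fused
    to DHFR F[3]). Both orientations of a pair collapse to the same key.
    """
    unique = set()
    for bait, prey in pairs:
        # frozenset ignores order, so (X, Y) and (Y, X) give the same key
        unique.add(frozenset((bait, prey)))
    return unique


# Hypothetical detections, for illustration only
detected = [("Rpn5", "Rpn6"), ("Rpn6", "Rpn5"), ("Pre1", "Pre2")]
unique_ppis = collapse_reciprocal(detected)
print(len(unique_ppis))
```

Using an unordered key keeps the proportions in panel (C) from being inflated by interactions that happen to be detected in both orientations.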
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the pairwise distances between proteins. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal, and those that are newly detected. p-values of Wilcoxon tests are shown.
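Panel (B) reports Spearman correlations between PPI z-scores and C-terminus distances. The sketch below shows how such a rank correlation is obtained, as the Pearson correlation of the ranks. The data values are hypothetical and the helper names are introduced here for illustration only; the software actually used for the figure is not specified in this section.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman r is the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: C-terminus distances (arbitrary units) and PPI z-scores
distances = [20.0, 35.0, 50.0, 80.0, 120.0]
z_scores = [9.1, 7.4, 7.9, 3.2, 2.6]
print(round(spearman(distances, z_scores), 2))
```

A negative coefficient, as in the caption, means that the signal tends to drop as the C-termini get farther apart.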
Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structures and the experimental sequences
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to detect interactions between proteins that are farther apart. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
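The positive-call rule described in panel (B), a colony size whose z-score exceeds a threshold, can be illustrated with a minimal sketch. It assumes that the z-score is the deviation of a colony size from the mean of a background distribution, in units of the background standard deviation; the exact normalization used in the study is not given in this section, and the function names and numbers below are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff for calling a PPI positive (assumed reading of the caption)


def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background distribution."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation of the background
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background, threshold=Z_THRESHOLD):
    """Return True for each colony whose z-score is above the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes (arbitrary units), for illustration only
background = [100, 110, 90, 105, 95]
tested = [104, 160, 118]
print(call_positive(tested, background))
```

With these made-up numbers only the second colony clears the cutoff, which mirrors how the dashed threshold lines in panel (B) separate positive PPIs from background.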
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
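A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm on a graph whose nodes are surface residues and whose edges are weighted by Euclidean distance. This is a generic illustration, not the procedure used to produce Table S2A: the neighbour cutoff, the coordinates and the function names are all assumptions made for the example.

```python
import heapq
import math


def build_graph(points, cutoff):
    """Connect every pair of points closer than `cutoff`, weighted by distance."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Hypothetical coordinates; indices 0 and 3 stand in for two C-terminal residues
points = [(0, 0, 0), (3, 4, 0), (6, 8, 0), (6, 8, 5)]
graph = build_graph(points, cutoff=6.0)
print(shortest_path_length(graph, 0, 3))
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex, so the result can be longer than the straight-line distance between the two C-termini.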
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term "hybrid method" refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.
The results of the first validation show that by changing a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if only a single combination were used, because some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. Nevertheless, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations accurately reflect the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA is a valuable step forward in the study of protein-protein associations. Simply by doubling the length of the linker on the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the longer linkers, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the fusion proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined to maximize the new information gained, taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas the resolution should be coarser for large protein complexes.
The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
48
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
28
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased compared with the 2xL-2xL reference interaction). Solid shapes: new
PPIs (not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR
F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Striped patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
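
The counting convention in (C), where the two reciprocal orientations of a protein pair are treated as one PPI, can be sketched as follows. This is a minimal illustration and not the analysis code used in this study: the function names, the status labels and the priority rule used when the two orientations disagree are all assumptions.

```python
from collections import Counter

# Rule for resolving disagreement between the two orientations of a pair.
# This ordering is an assumption made for illustration only.
PRIORITY = {"unchanged": 0, "changed": 1, "new": 2}


def collapse_reciprocal(calls):
    """Merge (X, Y) and (Y, X) into one unordered PPI.

    `calls` maps (bait, prey) tuples to a status label.
    Returns a dict keyed by frozenset({bait, prey}).
    """
    merged = {}
    for (bait, prey), status in calls.items():
        key = frozenset((bait, prey))
        current = merged.get(key)
        if current is None or PRIORITY[status] > PRIORITY[current]:
            merged[key] = status
    return merged


def proportions(merged):
    """Fraction of unique PPIs in each status category."""
    counts = Counter(merged.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: n / total for status, n in counts.items()}
```

Keying on a `frozenset` makes the pair order-independent, so each PPI contributes once to the proportions regardless of which protein carried which fragment.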
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different pairwise protein
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini: interactions that are
not affected by longer linkers, those that increase in signal, and those that are newly
detected. p-values of Wilcoxon tests are shown.
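
The rank correlation and distance binning mentioned in (B) can be illustrated with a short, dependency-free sketch. It is not the code used for the figure: the function names are hypothetical, and the use of equal-width bins is an assumption, since the legend states only that ten distance classes were used.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bin_by_distance(distances, n_bins=10):
    """Assign each distance to one of n_bins equal-width classes (0-based)."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(distances)
    return [min(int((d - lo) / width), n_bins - 1) for d in distances]
```

A negative coefficient from `spearman(distances, z_scores)` means that the PCA signal tends to fall as the C-termini get farther apart, which is the direction of the relationship reported in the legend.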
Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.
Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.
Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.
Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.
Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between the proteasome structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right;
RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to
detect interactions between proteins further apart in space. (E) Repetition of the standard
DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility.
(F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the
intra-complex experiment. Examples for each category of changes are shown. Cell growth in the
spot-dilution assay (right) correlates with colony size in the standard PCA (left).
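
The thresholding step described in (B), where a PPI is called positive when its z-score exceeds a cutoff, can be sketched as below. This is only an illustration under stated assumptions: the legend does not say how z-scores were computed, so standardizing against the mean and standard deviation of a background distribution of colony sizes is an assumption, as are the function names. The cutoff is passed in by the caller.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background distribution.

    Using the background mean and sample standard deviation is an
    assumption made for this sketch.
    """
    mu = mean(background)
    sigma = stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(names, colony_sizes, background, threshold):
    """Return the names of PPIs whose z-score is strictly above threshold."""
    scores = z_scores(colony_sizes, background)
    return [name for name, z in zip(names, scores) if z > threshold]
```

Expressing the cutoff as a z-score rather than a raw colony size makes calls comparable across experiments whose background growth differs.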
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the cores from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
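
A distance-weighted shortest path over surface residues, as illustrated in (C) and (D), can be sketched with Dijkstra's algorithm. This is a hypothetical reconstruction and not the procedure used in this study: the function name, the input format and the `cutoff` that decides which residues count as neighbors are all assumptions.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Dijkstra over surface points; edges join points within `cutoff`.

    coords: dict mapping a residue identifier to an (x, y, z) tuple.
    Edge weights are Euclidean distances between the two points.
    Returns (path_length, list_of_identifiers), or (math.inf, []) if
    `end` cannot be reached from `start`.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            break
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step > cutoff:
                continue  # too far apart to be treated as neighbors
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                previous[other] = node
                heapq.heappush(queue, (candidate, other))
    if end not in visited:
        return math.inf, []
    # Walk back from the end point to recover the path.
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[end], path[::-1]
```

Restricting the graph to surface residues forces the path to follow the outside of the complex, so the resulting length is never shorter than the straight-line distance between the two C-termini.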
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that lie close to each other in
space, whether or not they interact physically. Such a method would allow the architecture
of protein complexes to be examined in greater depth and dissected more finely. In practice,
this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate
the method, we first had to verify whether increasing linker length changed the interactions
detected. It was also relevant to test the method on protein complexes using several
combinations of linkers of different lengths. Finally, the validation could be completed by
comparing the results with distances measured from the available protein structures of the
proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variant of DHFR PCA detects new interactions while retaining the
specificity of the method. Indeed, more than 30% of all unique interactions detected were
new. A comparable proportion of new interactions could therefore be expected if this PCA
variant were applied to protein complexes that have already been studied. This percentage
could vary with the number of combinations of linkers of different lengths that are used. For
example, it could be lower if only a single combination were used, because some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most
of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, in particular by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when central proteins are involved.
Finally, the detected associations closely reflect the organization of protein complexes
into subcomplexes. Comparing the distances between proteins in the proteasome structures
with the PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are farther apart in space.
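As a minimal sketch of the distance comparison described above, the following Python snippet computes the straight-line distance between the C-termini of two chains. The chain representation (a list of residue-number and coordinate pairs) and the function names are hypothetical and chosen for illustration only; they are not the modeling pipeline used in this work. When C-terminal residues are missing from a structure, the last resolved residue is used, so the result is only an approximation.

```python
import math

def c_terminus(chain):
    """Return the coordinates of the last resolved residue of a chain.

    `chain` is a hypothetical representation: a list of
    (residue_number, (x, y, z)) tuples, coordinates in angstroms.
    """
    return max(chain, key=lambda residue: residue[0])[1]

def c_termini_distance(chain_a, chain_b):
    """Euclidean distance (in angstroms) between the C-termini of two chains."""
    return math.dist(c_terminus(chain_a), c_terminus(chain_b))

# Toy coordinates, for illustration only (not taken from any structure).
chain_a = [(1, (10.0, 0.0, 0.0)), (2, (0.0, 0.0, 0.0))]
chain_b = [(1, (5.0, 5.0, 5.0)), (2, (3.0, 4.0, 0.0))]
print(c_termini_distance(chain_a, chain_b))
```

Such distances can then be set against the PCA z-scores of the corresponding protein pairs.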
The modification made to the DHFR PCA represents a notable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the linkers
of other lengths. This would provide a validation and a point of comparison, and would help
detect problems that may have occurred during the construction of the proteins, such as
incorrect recombination or the appearance of mutations. Interactions would be observed for
the correctly constructed protein, whereas the problematic one would show none. Adding this
control does, however, make the experiments and the analyses more complex. Despite this
drawback, this variation of the DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no particular infrastructure, yet it can also be applied on
a large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that
keeps the endogenous promoter for protein expression. The fragments do not tend to
associate spontaneously unless they are in very close proximity, which reduces false
positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore
easy to study PPIs under several growth conditions or in the presence of cellular
perturbations. It can also be followed in real time, which gives access to the study of
interaction dynamics (56). These features provide certain advantages over the other hybrid
methods.
In this project, only two linker lengths were tested. It would be interesting to establish
a range of linker lengths that provides several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, to maximize the new information gained while taking into account the complexity
added by each additional linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such
additional collections would provide an image of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the detected associations, which decreases the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size
and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution
would be needed for large protein complexes.
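The trade-off between linker length and resolution can be illustrated with a rough back-of-the-envelope calculation. The sketch below estimates an upper bound on the separation that two fully extended linkers could bridge. Every number in it is an assumption made for illustration: the contour length per residue (3.5 angstroms) and the linker lengths in residues (10 and 20) are hypothetical values, not measurements from this work. The size of the DHFR fragments is ignored, and flexible linkers are rarely fully extended, so the effective reach is expected to be shorter.

```python
def max_linker_span(residues_a, residues_b, angstrom_per_residue=3.5):
    """Upper bound (in angstroms) on the separation bridged by two fully extended linkers.

    angstrom_per_residue is an assumed contour length per residue, not a value
    taken from this work; the size of the reporter fragments is ignored.
    """
    return (residues_a + residues_b) * angstrom_per_residue

# Hypothetical linker lengths in residues, for illustration only.
standard, doubled = 10, 20
for a, b in [(standard, standard), (doubled, standard), (doubled, doubled)]:
    print(f"linkers of {a} and {b} residues: at most {max_linker_span(a, b)} angstroms")
```

Under these assumptions, each increment in linker length adds a fixed amount to the maximum span, which is one simple way to reason about the choice of increment discussed above.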
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the preservation of messenger RNA during cellular
stress, and they are linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the
linker would provide a general picture of their architecture. At the level of an organism's
proteome, this method would provide a better definition of the organization of the cellular
machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.
(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly
detected. p-values of Wilcoxon tests are shown.
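The analysis in panel B can be sketched in a few lines of Python. The snippet below is an illustration only, not the analysis code used for the figure: it computes a Spearman rank correlation between C-terminal distances and z-scores, and groups the z-scores into ten distance classes. The function names are hypothetical, the toy data are invented, and equal-width bins are an assumption, since the caption only states that the data were binned into ten classes.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def bin_by_distance(distances, scores, n_bins=10):
    """Group scores into n_bins equal-width distance classes (an assumed binning)."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width when all distances are equal
    bins = [[] for _ in range(n_bins)]
    for d, s in zip(distances, scores):
        index = min(int((d - lo) / width), n_bins - 1)  # the maximum falls in the last bin
        bins[index].append(s)
    return bins

# Invented toy data: the z-score decreases as the distance increases.
toy_distances = [10, 20, 30, 40]
toy_zscores = [4.0, 3.0, 2.0, 1.0]
print(spearman(toy_distances, toy_zscores))
```

A negative coefficient, as reported in the caption, means that the PCA signal tends to decrease as the distance between C-termini increases.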
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right because the 4xL linker can detect
interactions farther apart in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples
for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
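The thresholding described in panel (B) can be illustrated with a minimal sketch. It assumes that the threshold reads as a z-score of 2.5 and that z-scores are computed against the mean and standard deviation of the whole colony-size distribution; the caption does not state the reference distribution, so both the function name `call_positive_ppis` and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def call_positive_ppis(colony_sizes, z_threshold=2.5):
    """Flag protein pairs whose colony-size z-score exceeds the threshold.

    colony_sizes maps a pair label to a colony size.
    Assumption: z-scores are taken against the mean and standard
    deviation of the full distribution passed in.
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("colony sizes have zero variance")
    calls = {}
    for pair, size in colony_sizes.items():
        z = (size - mu) / sigma
        calls[pair] = (z, z > z_threshold)
    return calls


# Toy data (hypothetical): a tight background plus one large colony.
toy = {f"background_{i}": 100 + (i % 3) for i in range(30)}
toy["hit"] = 500
calls = call_positive_ppis(toy)
positives = [pair for pair, (_, is_positive) in calls.items() if is_positive]
```

In this toy set only the large colony is called positive. A real analysis would probably estimate the background from negative controls rather than from the full distribution.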
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
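The idea of a distance-weighted shortest path between two C-termini, as in panels (C) and (D), can be sketched with Dijkstra's algorithm over a set of surface points. This is only an illustration under stated assumptions: the hop `cutoff`, the function name `surface_path_length` and the toy coordinates are hypothetical, and the thesis does not specify here which software or parameters were actually used.

```python
import heapq
import math


def surface_path_length(coords, start, goal, cutoff=10.0):
    """Length of the shortest path from start to goal over surface points.

    coords maps a residue label to (x, y, z) coordinates.
    Two points are connected when they are at most `cutoff` apart
    (assumed value), and each edge is weighted by Euclidean distance.
    Returns math.inf when the goal cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step > cutoff:
                continue
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(queue, (candidate, other))
    return math.inf


# Toy coordinates (hypothetical): two C-termini and one surface residue.
toy_coords = {
    "Cter_A": (0.0, 0.0, 0.0),
    "surface_1": (3.0, 0.0, 0.0),
    "Cter_B": (3.0, 4.0, 0.0),
}
along_surface = surface_path_length(toy_coords, "Cter_A", "Cter_B", cutoff=4.5)
```

With a 4.5-unit hop limit the direct 5-unit jump is forbidden, so the path goes through the intermediate surface point. Restricting the path to surface points is what keeps it from cutting through the interior of the complex.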
General conclusion
The aim of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are in close
proximity in space without necessarily being physical interactions. Such a method would make
it possible to examine and dissect the architecture of protein complexes in greater depth.
In practice, this meant modifying the length of the linkers used in the DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results obtained with the distances measured from the available protein
structures of the proteasome.
The results of the first validation show that by adjusting a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly,
allowing better identification of associations. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the associations detected suggests that the specificity of
the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been
studied. This percentage could vary with the number of combinations of linkers of different
lengths used. For example, it could be lower if only a single combination were used, since
some protein-protein associations were detectable only with a specific combination of
linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to
detect the majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers provides information
on the organization of protein complexes, particularly when it involves the central proteins.
Finally, the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length does
indeed allow the detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a significant advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of additional lengths, which would provide validation and a point of
comparison and would help detect problems that may have occurred during construction of the
proteins. For example, it is easier to spot a problem of incorrect recombination or the
appearance of mutations. Indeed, interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. However, adding this control certainly
makes the experiments and analyses more complex. Despite this drawback, this variation of the
DHFR PCA gives access to an additional hybrid method that remains relatively simple. It
requires no particular infrastructure, yet it can also be applied at large scale using a
robotic platform. Furthermore, the DHFR PCA is an in vivo method that retains the endogenous
promoter for protein expression. The fragments do not tend to associate spontaneously unless
they are very close together, which reduces false positives. The DHFR PCA can be performed
either on solid medium or in liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be followed in
real time, which gives access to the study of interaction dynamics (56). These features
provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. It would
first be necessary to determine the maximum length that allows plausible protein-protein
associations to be detected while limiting false positives. It would also be necessary to
determine the optimal increment to maximize new information, taking into account the
additional complexity introduced with each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the detected associations, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into account. For small protein complexes,
a finer resolution, and therefore shorter linkers, might be sufficient, whereas the
resolution would need to be lower for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
perfectly known but which are visible by electron microscopy or with other imaging methods.
The size of these complexes greatly limits their study and makes determining their
architecture a challenge. Processing bodies and stress granules are one example. They are
involved, respectively, in the degradation and the storage of messenger RNA during cellular
stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the
linker would give us a general picture of their architecture. At the level of an organism's
proteome, this method would provide a better definition of the organization of the cellular
machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Table S1A Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request
Table S1B PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request
Table S1C PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request
Table S1D PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request
Table S2A Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request
Table S2B Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions
with the 4xL-4xL combination. Correlation coefficients for the other combinations are
r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI
z-scores for the proteasome for all combinations of linker lengths according to the distance
between the interacting proteins. The red line represents the density of distances for all
interactions. The distribution for detected interactions is shifted to the left because
proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of change are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in standard PCA (left).
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
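The distance-weighted shortest path shown in (C) and (D) can be illustrated with a minimal sketch. It assumes, as hypotheticals that the caption does not state, that surface residues are available as 3D coordinates, that two surface points are connected when they lie within a chosen `cutoff`, and that each connection is weighted by its Euclidean length. Dijkstra's algorithm then returns the length of the shortest route over the surface between two C-terminal positions. The function name, the cutoff and the toy coordinates are illustrative only and are not taken from the study.

```python
import heapq
import math


def surface_path_length(points, start, goal, cutoff):
    """Length of the shortest path from points[start] to points[goal].

    points : list of (x, y, z) coordinates of surface residues (assumed input)
    cutoff : maximum distance for two points to count as neighbours
             (hypothetical parameter, not specified in the source)
    Returns math.inf when the two points are not connected.
    """
    n = len(points)
    best = [math.inf] * n          # best known distance to each point
    best[start] = 0.0
    heap = [(0.0, start)]          # (distance so far, point index)
    while heap:
        d, i = heapq.heappop(heap)
        if i == goal:
            return d
        if d > best[i]:
            continue               # stale heap entry
        for j in range(n):
            if j == i:
                continue
            step = math.dist(points[i], points[j])
            # Only neighbouring surface points are linked; the edge weight
            # is the Euclidean distance between them.
            if step <= cutoff and d + step < best[j]:
                best[j] = d + step
                heapq.heappush(heap, (best[j], j))
    return math.inf


# Toy example: the direct 0 -> 2 hop (length 6) exceeds the cutoff,
# so the path has to go through point 1 (5 + 5).
toy_points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
print(surface_path_length(toy_points, 0, 2, cutoff=5.0))  # 10.0
```

Restricting edges to nearby surface points is what makes the path follow the surface of the complex instead of cutting straight through it; with a large cutoff the result reduces to the straight-line distance.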
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in
space without necessarily being in direct physical contact. Such a method would make it
possible to dissect the architecture of protein complexes in greater depth. In practice, this
meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate
the method, we first had to verify whether increasing linker length changed the interactions
detected. It was also relevant to test the method on protein complexes using several
combinations of linkers of different lengths. Finally, the validation could be completed by
comparing the results with distances measured from the available protein structures of the
proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein complexes
and between different complexes, notably between the proteasome and the actin cytoskeleton.
The nature of the detected associations suggests that the specificity of the DHFR PCA is
preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of the DHFR PCA detects new interactions while retaining
the specificity of the method. Indeed, of all the unique interactions detected, more than 30%
were new. One could therefore expect to obtain nearly as many new interactions if this
variation of the PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of linker-length combinations used. For example, it
could be lower if only one combination were used, since some protein-protein associations
were detectable only with a specific combination of linkers. Using an extended linker on the
DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of
those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, using extended linkers provides information on
the organization of protein complexes, particularly when central proteins are involved.
Finally, the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does make it possible to detect
associations between proteins that are farther apart in space.
The modification made to the DHFR PCA is a valuable advance in the study of protein-protein
associations. Simply doubling the linker length of the DHFR F[1,2] fragment increases the
ability to detect distant protein-protein associations. In future experiments, it would be
appropriate to use the standard linker alongside the longer linkers, which would provide a
validation and a point of comparison and help detect problems that may have arisen during
construction of the proteins. For example, it becomes easier to spot faulty recombination or
the appearance of mutations: interactions would be observed for the correctly constructed
protein, whereas the problematic one would show none. Admittedly, adding this control makes
the experiments and analyses more complex. Despite this drawback, this variation of the DHFR
PCA provides an additional hybrid method that remains relatively simple. It requires no
special infrastructure, yet can also be applied at large scale using a robotic platform.
Moreover, the DHFR PCA is an in vivo method that preserves the endogenous promoter for
protein expression. The fragments do not tend to associate spontaneously unless they are very
close to each other, which reduces false positives. The DHFR PCA can be performed on either
solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or
in the presence of cellular perturbations. It can also be followed in real time, which gives
access to the dynamics of interactions (56). These features offer certain advantages over
other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths giving several resolutions of the PPI network. One would first need
to determine the maximum length that still detects plausible protein-protein associations
while limiting false positives. One would also need to determine the optimal increment that
maximizes new information, taking into account the added complexity of each additional
linker. The availability of robotic platforms makes it more realistic to create collections
of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections
would provide a picture of the yeast protein-protein association network at different
resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the
detected associations, which lowers the molecular resolution. Before investigating a protein
complex more exhaustively, its characteristics, such as size and flexibility, should be taken
into account. For small protein complexes, a finer resolution, and therefore shorter linkers,
could be sufficient, whereas a coarser resolution would be appropriate for large protein
complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully known
but that are visible by electron microscopy or other imaging methods. The size of these
complexes greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are examples. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are notably
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution afforded by linker extension would give us a general picture of
their architecture. At the scale of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
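The identity values in Tables S2B and S2C compare the sequence of each modeled chain with the corresponding experimental sequence. The sketch below shows one common way to compute such a value from a pairwise alignment. It assumes the two sequences are already aligned, with '-' as the gap character, and that identity is counted over gap-free columns only. The function name and the toy sequences are hypothetical, and the definition used for the tables may differ (for example in the choice of denominator).

```python
def percent_identity(aligned_ref, aligned_query):
    """Percent of gap-free alignment columns in which both residues are identical.

    Assumption: both strings come from the same pairwise alignment and use
    '-' for gaps. Columns containing a gap are ignored.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_ref, aligned_query):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical sequences): 6 matches over 7 gap-free columns.
example = percent_identity("MKT-AYIA", "MKTLAYLA")
```

Counting over gap-free columns keeps unresolved or unmodeled segments from lowering the score; dividing by the full length of the experimental sequence would instead penalize them.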
Table S2C. Identity between proteasome structure and the experimental sequence
Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
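The counts in Table S2D can be read as the number of residues between the last residue modeled in a structure and the end of the full-length protein. The following is a minimal sketch of that bookkeeping, assuming the residue numbering of the structure follows the full-length sequence (1-based). The function name and the example numbers are hypothetical and are not taken from the table.

```python
def missing_c_terminal_residues(sequence_length, resolved_residue_numbers):
    """Number of residues after the last modeled residue of a chain.

    sequence_length: length of the full-length protein sequence
    resolved_residue_numbers: iterable of residue numbers present in the
        structure (assumed to use the same 1-based numbering as the sequence)
    """
    resolved = list(resolved_residue_numbers)
    if not resolved:
        return sequence_length  # nothing modeled: the whole chain is missing
    last_resolved = max(resolved)
    return max(sequence_length - last_resolved, 0)


# Hypothetical chain of 120 residues modeled from residue 5 to residue 110.
example = missing_c_terminal_residues(120, range(5, 111))
```

A large count means the reporter fragment is attached far from the last modeled residue, so distances measured from the structure are less reliable for that protein.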
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a
z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to
detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for
selected results of the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex
experiment. Examples for each category of change are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
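The positive-PPI call described in (B) can be sketched as a z-score threshold on colony size. This is an illustration under stated assumptions, not the analysis pipeline of the thesis: z-scores are computed here against the mean and sample standard deviation of a background set of colony sizes assumed to represent non-interacting pairs, and the cutoff is left as a parameter (set to 2.5 by default). The function name, the choice of background and the toy values are hypothetical.

```python
from statistics import mean, stdev


def call_positive_ppis(colony_sizes, background, z_cutoff=2.5):
    """Return {pair: z_score} for pairs whose z-score exceeds z_cutoff.

    colony_sizes: dict mapping (bait, prey) -> colony size
    background:   list of colony sizes assumed to come from non-interacting
                  pairs (assumption; the thesis may define this differently)
    """
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_cutoff}


# Toy data in arbitrary units (hypothetical values).
background = [100, 110, 90, 105, 95]
sizes = {("Rpn5", "Rpn6"): 200, ("Rpn5", "Act1"): 104}
positives = call_positive_ppis(sizes, background)
```

With these toy numbers only the first pair passes the cutoff; the second lies within the spread of the background and is discarded.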
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
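The distance-weighted shortest path in (C) and (D) can be illustrated with Dijkstra's algorithm on a graph of surface points. This is a sketch under stated assumptions, not the procedure used to produce the figure: nodes are 3D coordinates, two nodes are connected when they lie within a hypothetical `cutoff` distance, and each edge is weighted by its Euclidean length. The function name, the cutoff and the toy coordinates are illustrative only.

```python
import heapq
import math


def surface_path_length(points, start, goal, cutoff):
    """Length of the shortest path from points[start] to points[goal].

    points: list of (x, y, z) coordinates, e.g. surface residues plus the
            two C-terminal residues (assumed input)
    cutoff: maximum distance at which two points are joined by an edge
    Returns math.inf if the goal cannot be reached.
    """
    dist = [math.inf] * len(points)
    dist[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            return d
        if d > dist[u]:
            continue  # stale queue entry
        for v, p in enumerate(points):
            if v == u:
                continue
            step = math.dist(points[u], p)  # Euclidean edge weight
            if step <= cutoff and d + step < dist[v]:
                dist[v] = d + step
                heapq.heappush(queue, (dist[v], v))
    return math.inf


# Toy example: the direct segment (length 6) exceeds the cutoff,
# so the path has to go through the intermediate point.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
around = surface_path_length(points, 0, 2, cutoff=5.5)
```

Restricting edges to short hops between surface points forces the path to follow the surface of the complex rather than cut through it, which is the reason for using a path length instead of the straight-line distance between the two C-termini.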
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, the term
"hybrid method" refers to a method that detects associations between proteins that are close
to each other in space without necessarily being in physical contact. Such a method would make
it possible to probe and dissect the architecture of protein complexes in greater depth. In
practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae.
To validate the method, we first had to verify whether increasing the linker length changed
the interactions detected. It was also relevant to test the method on protein complexes using
several combinations of linkers of different lengths. Finally, the validation could be
completed by comparing the results with distances measured from the available protein
structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
retaining the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
used. For example, it could be lower if only one combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most
of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers sheds light on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the associations detected reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are farther apart in space.
The modification made to the DHFR PCA is a meaningful advance for the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside linkers of
additional lengths. This would provide a validation and a point of comparison, and would help
detect problems that arose during construction of the fusion proteins. For example, it
becomes easier to spot a faulty recombination event or the appearance of mutations:
interactions would be observed for the correctly constructed protein, whereas the problematic
one would show none. Adding this control does, however, make the experiments and analyses
more complex. Despite this drawback, this variation of the DHFR PCA provides an additional
hybrid method that remains relatively simple. It requires no special infrastructure, yet it
can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an
in vivo method that retains the endogenous promoter for protein expression. The fragments do
not tend to associate spontaneously unless they are brought very close together, which
reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so PPIs
are easy to study under multiple growth conditions or in the presence of cellular
perturbations. It can also be monitored in real time, which gives access to the dynamics of
interactions (56). These features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths in order to view the PPI network at several resolutions. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, so as to maximize the new information gained while accounting for the additional
complexity introduced by each added linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the associations detected, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution
would be needed for large protein complexes.
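The relationship between linker length and resolution can be made concrete with a rough upper bound on the separation that two fully extended linkers could span. The sketch below is only an order-of-magnitude illustration: the number of residues per linker unit (`residues_per_unit`) and the length contributed by each residue (`angstrom_per_residue`, set to about 3.5 Å for an extended chain) are assumptions, not values taken from this work, and flexible linkers rarely reach full extension in the cell. The size of the reporter fragments is ignored.

```python
def max_linker_reach(n_units_a, n_units_b, residues_per_unit=5, angstrom_per_residue=3.5):
    """Upper bound (in angstroms) on the distance spanned by two extended linkers.

    n_units_a, n_units_b: number of linker units on each fusion (e.g. 2 for 2xL)
    residues_per_unit:    residues in one linker unit (hypothetical default)
    angstrom_per_residue: length per residue of an extended chain (assumed)
    """
    total_residues = (n_units_a + n_units_b) * residues_per_unit
    return total_residues * angstrom_per_residue


# The four linker combinations, expressed as numbers of linker units.
combos = {"2xL-2xL": (2, 2), "2xL-4xL": (2, 4), "4xL-2xL": (4, 2), "4xL-4xL": (4, 4)}
reach = {name: max_linker_reach(a, b) for name, (a, b) in combos.items()}
```

Under these assumptions the bound doubles between the 2xL-2xL and 4xL-4xL combinations, which is the sense in which longer linkers trade molecular resolution for reach.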
The method developed during this master's project is particularly promising for the study of
macromolecular protein complexes. These are complexes whose composition is not fully known
but which are visible by electron microscopy or other imaging methods. Their size severely
limits their study and makes determining their architecture a challenge. Processing bodies
and stress granules are one example. They are involved in the degradation and storage,
respectively, of messenger RNA during cellular stress, and they are linked to various
diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of
resolution afforded by linker extension would give us a general picture of their
architecture. At the scale of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
47
22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90
48
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
Table S2B. Identity between each RNApol structure and the experimental sequences
Reference Yeast protein Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
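The table reports percent identity between each structure chain and the experimental sequence, but this section does not say how the value was computed. As a rough sketch only, one common convention counts matching residues over the aligned columns that contain no gap. The function name `percent_identity`, the gap character, and the choice of denominator are all assumptions made for illustration, not the procedure used in the study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            # Assumption: gapped columns are excluded from the denominator.
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so values from different tools are not always directly comparable.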
Table S2C. Identity between the proteasome structures and the experimental sequences
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures
Yeast protein Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right, due to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
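The two quantities named in panels (B) and (C), a z-score cutoff on colony size and a correlation between reciprocal interactions, can be illustrated with a minimal sketch. It assumes that z-scores are taken against the mean and standard deviation of the whole set of measured colony sizes and that the correlation is a Pearson coefficient; the caption does not state either choice. The function names and the dictionary layout keyed by (bait, prey) pairs are hypothetical.

```python
import math
import statistics


def z_scores(values):
    """Standardize values against their own mean and standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mean) / sd for v in values]


def call_positive(colony_sizes, threshold=2.5):
    """Return the pairs whose colony-size z-score exceeds the threshold."""
    pairs = list(colony_sizes)
    scores = z_scores([colony_sizes[p] for p in pairs])
    return {p: z for p, z in zip(pairs, scores) if z > threshold}


def reciprocal_correlation(signal):
    """Pearson r between the two orientations (A-B and B-A) of each pair."""
    xs, ys = [], []
    for (bait, prey), value in signal.items():
        # Count each unordered pair once, and only if both orientations exist.
        if bait < prey and (prey, bait) in signal:
            xs.append(value)
            ys.append(signal[(prey, bait)])
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In a real screen the background distribution used for the z-score would probably be chosen more carefully, for example per plate or per bait, so the single global distribution here is only a simplification.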
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
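A distance-weighted shortest path over surface residues, as in panels (C) and (D), can be sketched with Dijkstra's algorithm. The sketch assumes that each surface residue is a node, that two residues are connected when they lie within a cutoff distance, and that each edge is weighted by the Euclidean distance between them. The caption does not give the graph construction, so the cutoff value, the coordinate dictionary and the function name are hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff=10.0):
    """Length of the shortest path from start to end through nearby residues.

    coords maps a residue identifier to its (x, y, z) position. Two residues
    are linked when they are at most `cutoff` apart (an assumed parameter).
    Returns math.inf when end cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    done = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node in done:
            continue
        if node == end:
            return dist
        done.add(node)
        for other, xyz in coords.items():
            if other in done:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf
```

Because the path is forced along surface residues, its length is at least the straight-line distance between the two C-termini, which is the point of routing around the complex instead of through it.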
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close to each
other in space, without these associations necessarily being physical interactions. Such a
method would make it possible to examine and dissect the architecture of protein complexes in
greater depth. In practice, this meant modifying the length of the linkers used in the DHFR
PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether
increasing linker length changed the interactions detected. It was also relevant to test the
method on protein complexes using several combinations of linkers of different lengths.
Finally, the validity of the method could be further confirmed by comparing the results
obtained with distances measured from the available protein structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thereby
improved the identification of associations. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the detected associations suggests that the specificity of
the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, of all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been
studied. This percentage could vary with the number of combinations of linkers of different
lengths used. For example, it could be lower if only a single combination were used, since
some protein-protein associations were detectable only with a specific combination of
linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to
detect most of the new PPIs and those whose signal increases.
44
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers provides information
on the organization of protein complexes, particularly when it involves the central proteins.
Finally, the detected associations reflect well the organization of protein complexes into
sub-complexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length does
allow the detection of associations between proteins that are farther apart in space.
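The comparison between structural distances and PCA results described above can be illustrated with a minimal sketch. It assumes that the position of each protein's C-terminus (where the DHFR fragment is fused) is available as a 3D coordinate in angstroms, and that detected associations are given as protein pairs. The function names, the protein labels and the coordinates are hypothetical placeholders, not values taken from the proteasome structures or from the experiments.

```python
import math
from statistics import median


def cterm_distance(a, b):
    """Euclidean distance (angstroms) between two C-terminal positions."""
    return math.dist(a, b)


def median_distance_by_detection(cterm_coords, detected_pairs):
    """Return (median distance of detected pairs, median distance of undetected pairs)."""
    names = sorted(cterm_coords)
    detected = {frozenset(pair) for pair in detected_pairs}
    hit, miss = [], []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = cterm_distance(cterm_coords[a], cterm_coords[b])
            # Sort each pair into the detected or undetected group
            (hit if frozenset((a, b)) in detected else miss).append(d)
    return median(hit), median(miss)


# Placeholder coordinates, NOT taken from any real structure
coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (30.0, 0.0, 0.0),
    "C": (0.0, 40.0, 0.0),
    "D": (120.0, 0.0, 0.0),
}
detected = [("A", "B"), ("A", "C")]  # hypothetical positive PCA pairs

near, far = median_distance_by_detection(coords, detected)
print(near < far)
```

If detected pairs are on average closer than undetected ones, the two medians differ in the expected direction; repeating the comparison for each linker combination would show whether longer linkers shift the detected distances upward.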
The modification made to the DHFR PCA is a valuable advance in the study of protein-protein
associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases
the capacity to detect distant protein-protein associations. In future experiments, it would
be appropriate to use the standard linker in addition to linkers of other lengths. This would
provide both a validation and a point of comparison, and would help detect problems that
occurred during the construction of the proteins, such as incorrect recombination or the
appearance of mutations. Interactions would be observed for the correctly constructed protein,
whereas the problematic one would show none. Adding this control does, however, make the
experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA
provides an additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover,
the DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are very close
together, which reduces false positives. The DHFR PCA can be performed on either solid or
liquid medium. It is therefore easy to study PPIs under several growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives access
to the study of interaction dynamics (56). These features offer certain advantages over other
hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths giving several resolutions of the PPI network. It would first be
necessary to determine the maximum length that allows plausible protein-protein associations
to be detected while limiting false positives. The optimal increment would also have to be
determined in order to maximize new information while taking into account the added
complexity of each additional linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such
additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the detected associations, which decreases the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as its size and
flexibility, should be taken into consideration. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas the resolution should
be coarser for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. The size
of these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress, and
they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by extending the linker would give us a
general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2
Table S2C. Identity between proteasome structure and the experimental sequence
Reference  Yeast proteins  Complex  Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins  Complex  Reference  Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of change are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
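The thresholding described in panel (B) can be illustrated with a minimal Python sketch. It assumes that z-scores are computed against the mean and sample standard deviation of all colonies in one experiment, which the caption does not state. The function name `call_positive_ppis`, the default threshold and the example data are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def call_positive_ppis(colony_sizes, z_threshold=2.5):
    """Return {pair: z_score} for pairs whose z-score exceeds z_threshold.

    colony_sizes maps a (bait, prey) pair to its colony size.
    Assumption: the reference distribution is every colony in the
    same experiment (mean and sample standard deviation).
    """
    sizes = list(colony_sizes.values())
    mu, sigma = mean(sizes), stdev(sizes)
    if sigma == 0:
        # All colonies identical: nothing stands out from the background.
        return {}
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_threshold}


# Hypothetical data: 20 background colonies and one large colony.
background = {("bait%02d" % i, "prey"): 100 + (i % 5) for i in range(20)}
example = dict(background)
example[("bait_X", "prey_Y")] = 400

positives = call_positive_ppis(example)
print(sorted(positives))
```

In practice the reference distribution might instead be a set of negative controls; the threshold is exposed as a parameter so that either choice can be explored.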
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
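A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm over a graph of surface residues. The sketch below is an illustration under stated assumptions, not the procedure used for the figure: residues are reduced to single 3D points, two residues are connected when they lie within a hypothetical `cutoff` distance, and the function name `surface_path_length` and the coordinates are invented for the example.

```python
import heapq
import math


def surface_path_length(coords, source, target, cutoff=10.0):
    """Length of the shortest path from source to target across residues.

    coords maps a residue label to an (x, y, z) point.
    Assumption: two residues are neighbors if they are at most `cutoff`
    apart, and each edge is weighted by its Euclidean length.
    Returns math.inf when target cannot be reached.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in visited:
            continue
        if u == target:
            return dist_u
        visited.add(u)
        for v, xyz in coords.items():
            if v in visited:
                continue
            weight = math.dist(coords[u], xyz)
            if weight <= cutoff:
                candidate = dist_u + weight
                if candidate < best.get(v, math.inf):
                    best[v] = candidate
                    heapq.heappush(heap, (candidate, v))
    return math.inf


# Hypothetical coordinates: "a" and "c" are too far apart to connect
# directly, so the path has to pass through "b".
points = {
    "a": (0.0, 0.0, 0.0),
    "b": (6.0, 0.0, 0.0),
    "c": (6.0, 8.0, 0.0),
    "far": (100.0, 0.0, 0.0),
}
print(surface_path_length(points, "a", "c", cutoff=9.0))
print(surface_path_length(points, "a", "far", cutoff=9.0))
```

Restricting the nodes to surface residues keeps the path on the outside of the complex, so the result is generally longer than the straight-line distance between the two C-termini.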
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in
space without necessarily interacting physically. Such a method would allow a deeper and
finer dissection of the architecture of protein complexes. In practice, this meant modifying
the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, we
first had to verify whether increasing the linker length changed the set of interactions
detected. It was also relevant to test how the method applies to the study of protein
complexes using several combinations of linkers of different lengths. Finally, the
validation could be completed by comparing the results with distances measured from the
available proteasome structures.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variant of the DHFR PCA detects new interactions while
retaining the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variant of the PCA were applied to protein complexes that have already
been studied. This percentage could vary with the number of linker-length combinations
used. For example, it could be lower if only one combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect
most of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers informs on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results, it is possible to confirm that increasing the linker length does allow the
detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA represents a real advance in the study of
protein-protein associations. Simply by doubling the length of the linker on the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker
alongside the longer linkers, which would provide a validation and a point of comparison and
would help detect problems that may have arisen during construction of the fusion proteins.
For example, it becomes easier to spot faulty recombination or the appearance of mutations:
interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control does, however, make the experiments and
analyses more complex. Despite this drawback, this variant of the DHFR PCA provides an
additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform.
Furthermore, the DHFR PCA is an in vivo method that retains the endogenous promoter for
protein expression. The fragments do not tend to associate spontaneously unless they are
very close together, which reduces false positives. The DHFR PCA can be performed on either
solid or liquid medium. It is therefore easy to study PPIs under multiple growth conditions
or in the presence of cellular perturbations. It can also be monitored in real time, which
gives access to the dynamics of interactions (56). These features provide certain advantages
over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths providing several resolutions of the PPI network. One would first
need to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. One would also need to determine the optimal
increment to maximize new information, taking into account the added complexity of each
additional linker. The availability of robotic platforms makes it more realistic to create
collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional
collections would provide a picture of the yeast protein-protein association network at
different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant
the associations detected, which lowers the molecular resolution. Before investigating a
protein complex more exhaustively, its characteristics, such as size and flexibility, should
be taken into account. For small protein complexes, a finer resolution, and therefore
shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large
protein complexes.
The method developed during this master's project is particularly attractive for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us
a general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Table S2C. Identity between proteasome structure and the experimental sequence
Reference Yeast protein Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast proteins Complex Reference Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
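The counts in Table S2D can be illustrated with a minimal Python sketch. It assumes, hypothetically, that residue numbering in the structure file matches the 1-based numbering of the full-length protein sequence; the function name and its inputs are illustrative and are not taken from the analysis scripts used in this work.

```python
def missing_c_terminal_residues(sequence_length, resolved_positions):
    """Count residues missing at the C-terminus of a chain.

    sequence_length: length of the full-length protein (assumed 1-based numbering).
    resolved_positions: residue numbers that have coordinates in the structure.
    """
    positions = list(resolved_positions)
    if not positions:
        # Nothing is resolved, so the whole chain is missing.
        return sequence_length
    last_resolved = max(positions)
    if last_resolved > sequence_length:
        raise ValueError("residue numbering exceeds the sequence length")
    # Everything after the last resolved residue is missing from the C-terminus.
    return sequence_length - last_resolved
```

For a hypothetical 120-residue protein resolved from residue 5 to residue 83, the function returns 37.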
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right)
correlates with colony size in standard PCA (left).
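As a rough illustration of how a z-score threshold on colony size can separate positive PPIs from background, as described for panel (B), the following Python sketch standardizes colony sizes and keeps the pairs above a cutoff supplied by the caller. It is only a sketch under stated assumptions: the function name, the use of the mean and population standard deviation of all colonies as the reference distribution, and the dictionary layout are hypothetical and do not reproduce the analysis pipeline used in this work.

```python
import statistics


def call_positive_ppis(colony_sizes, z_threshold):
    """Return {pair: z_score} for pairs whose z-score exceeds z_threshold.

    colony_sizes maps a (bait, prey) pair to its colony size.
    Assumption: z-scores are computed against the mean and population
    standard deviation of every colony passed in.
    """
    if not colony_sizes:
        return {}
    values = list(colony_sizes.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # All colonies are identical, so nothing stands out from background.
        return {}
    z_scores = {pair: (size - mean) / sd for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_threshold}
```

In practice the reference distribution matters: a screen dominated by true positives would inflate the mean and standard deviation, so a background-only estimate may be preferable to the simple choice made here.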
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
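The distance-weighted shortest path of panels (C) and (D) can be sketched in Python as Dijkstra's algorithm over a graph of surface points. This is a minimal sketch under assumptions that are not taken from this work: the function name is hypothetical, two points are treated as connected when their Euclidean distance does not exceed a caller-supplied `cutoff`, and each edge is weighted by that distance.

```python
import heapq
import math


def shortest_surface_path(coords, source, target, cutoff):
    """Length of the shortest path from source to target over surface points.

    coords maps a point label to its (x, y, z) coordinates.
    Assumption: two points are connected when they lie within `cutoff` of
    each other, and the edge weight is their Euclidean distance.
    Returns math.inf when no path exists.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale queue entry
        if u == target:
            return dist_u
        visited.add(u)
        for v, xyz in coords.items():
            if v in visited:
                continue
            step = math.dist(coords[u], xyz)
            if step <= cutoff:
                candidate = dist_u + step
                if candidate < best.get(v, math.inf):
                    best[v] = candidate
                    heapq.heappush(heap, (candidate, v))
    return math.inf
```

Scanning every point for neighbours keeps the sketch short but costs quadratic time; for a full proteasome surface, a spatial index or a precomputed neighbour list would be the more practical choice.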
General conclusion
The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that lie close together in space
without necessarily being in direct physical contact. Such a method would make it possible to
examine and dissect the architecture of protein complexes in greater depth. In practice, this
meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the
method, we first had to test whether increasing the linker length changed the set of
interactions detected. It was also relevant to test the method on protein complexes using
several combinations of linkers of different lengths. Finally, the validity of the method could
be further confirmed by comparing the results with distances measured from the available
protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, more than 30% of all unique interactions detected were
new. One could therefore expect to obtain almost as many new interactions if this variation of
PCA were applied to protein complexes that have already been studied. This percentage could
vary with the number of combinations of linkers of different lengths that are used. For
example, it could be lower if only a single combination were used, since some protein-protein
associations were detectable only with one specific combination of linkers. Using an extended
linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs
and most of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, in particular by identifying the strongest
associations within the complex. In addition, the use of extended linkers informs on the
organization of protein complexes, particularly when central proteins are involved. Finally,
the detected associations reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with
the PCA results confirms that increasing the linker length does allow the detection of
associations between proteins that are farther apart in space.
The modification made to DHFR PCA is a valuable advance in the study of protein-protein
associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases
the capacity to detect distant protein-protein associations. In future experiments, it would be
appropriate to use the standard linker alongside linkers of additional lengths. This would
provide a validation and a point of comparison, and would help detect problems that arose
during construction of the fusion proteins. For example, it becomes easier to spot a faulty
recombination event or the appearance of mutations, because interactions would be observed
for the correctly constructed protein whereas the problematic one would show none. Adding
this control does, however, make the experiments and analyses more complex. Despite this
drawback, this variation of DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no special infrastructure, yet it can also be applied at large
scale with a robotic platform. Moreover, DHFR PCA is an in vivo method that keeps the
endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are brought very close together, which reduces false positives.
DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs
under several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the dynamics of interactions (56). These
features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. The first step
would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also have to be
determined, so as to maximize the new information gained while taking into account the
added complexity of each additional linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the detected associations, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its size
and flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would suit large
protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are notably
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution afforded by extending the linker would give us a general picture of
their architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein; Complex; Reference; Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) are positive PPIs and have a z-score above
2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions
with the 4xL-4xL combination. Correlation coefficients for the other combinations are
r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI
z-scores for the proteasome for all combinations of linker lengths according to the distance
between the interacting proteins. The red line represents the density of distances for all
interactions. The distribution for detected interactions is shifted to the left because
proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right)
correlates with colony size in the standard PCA (left).
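The positive-call rule of panel (B) and the reciprocity check of panel (C) can be sketched as follows. This is a minimal illustration and not the analysis pipeline used in the study: the colony sizes are made-up numbers, the function names are hypothetical, the z-score is computed against the mean and standard deviation of the whole distribution (an assumption), and the cutoff is read as 2.5.

```python
from math import sqrt
from statistics import mean, pstdev

Z_THRESHOLD = 2.5  # assumed reading of the cutoff given in the caption


def z_scores(colony_sizes):
    """Standardize colony sizes against the whole distribution (assumption)."""
    mu = mean(colony_sizes)
    sigma = pstdev(colony_sizes)
    if sigma == 0:
        raise ValueError("all colony sizes are identical")
    return [(size - mu) / sigma for size in colony_sizes]


def call_positives(colony_sizes, threshold=Z_THRESHOLD):
    """Flag a PPI as positive when its z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes)]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical plate: nine background colonies and one large colony.
sizes = [10] * 9 + [100]
calls = call_positives(sizes)

# Hypothetical reciprocal pairs: signal for A-F[1,2]/B-F[3] vs B-F[1,2]/A-F[3].
forward = [1.0, 2.0, 3.0, 4.0]
reverse = [1.1, 1.9, 3.2, 3.8]
r = pearson_r(forward, reverse)
```

In this toy plate only the last colony is called positive, and `r` measures how well the two orientations of each pair agree, which is the quantity reported per linker combination in panel (C).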
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
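The idea behind the distance-weighted shortest path of panels (C) and (D) can be sketched with Dijkstra's algorithm. This is only an illustration under stated assumptions, not the procedure used to produce the figure: residues are treated as points, two surface residues are joined by an edge when they lie within a hypothetical `cutoff` distance, each edge is weighted by its Euclidean length, and the coordinates and residue labels below are invented.

```python
import heapq
import math


def surface_path_length(coords, start, end, cutoff):
    """Length of the shortest path from `start` to `end` over surface residues.

    coords maps a residue label to its (x, y, z) position. Residues closer
    than `cutoff` are connected; edges are weighted by Euclidean distance.
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == end:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(heap, (dist + step, other))
    return math.inf


# Hypothetical coordinates: two C-terminal residues and one
# intermediate surface residue.
residues = {
    "Scl1_Cter": (0.0, 0.0, 0.0),
    "surface_1": (3.0, 0.0, 0.0),
    "Rpn5_Cter": (3.0, 4.0, 0.0),
}
path = surface_path_length(residues, "Scl1_Cter", "Rpn5_Cter", cutoff=4.5)
direct = math.dist(residues["Scl1_Cter"], residues["Rpn5_Cter"])
```

With this cutoff the two termini are not directly connected, so the path goes through the intermediate surface residue and is longer than the straight-line distance. That is the point of measuring along the surface: a linker has to go around the complex rather than through it.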
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that lie close together in space,
whether or not they interact physically. Such a method would make it possible to examine and
dissect the architecture of protein complexes in greater depth. In practice, this meant
modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the
method, we first had to verify whether increasing the linker length changed the interactions
detected. It was also relevant to test the method on protein complexes using several
combinations of linkers of different lengths. Finally, the validation could be completed by
comparing the results with the distances measured from the available protein structures of
the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all unique interactions
detected were new. One could therefore expect to obtain nearly as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
that are used. For example, it could be lower if a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most
of the new PPIs and most of those whose signal increases. The rare cases in which the signal
decreased as the linker length increased are more likely caused by steric effects than by
destabilization of the proteins involved. These cases can nevertheless provide structural
information, notably by identifying the strongest associations within the complex. In
addition, the use of extended linkers is informative about the organization of protein
complexes, particularly when it involves the central proteins. Finally, the associations
detected reflect well the organization of protein complexes into subcomplexes. Comparing the
distances between proteins in the proteasome structures with the PCA results confirms that
increasing the linker length does allow the detection of associations between proteins that
are further apart in space.
The modification made to the DHFR PCA is a valuable advance for the study of protein-protein
associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases
the capacity to detect distant protein-protein associations. In future experiments, it would
be appropriate to use the standard linker alongside the linkers of additional lengths. This
would provide a validation and a point of comparison, and would help detect problems that
may have arisen during the construction of the fusion proteins, such as incorrect
recombination or the appearance of mutations. Interactions would be observed for the
correctly constructed protein, whereas the faulty one would show none. Adding this control
does, however, make the experiments and analyses more complex. Despite this drawback, this
variation of the DHFR PCA provides an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet it can also be applied at large scale
with a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the
endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are brought very close together, which reduces false positives.
The DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study
PPIs under several growth conditions or in the presence of cellular perturbations. It can
also be followed in real time, which gives access to the dynamics of interactions (56).
These features offer certain advantages over other hybrid methods.
Only two linker lengths were tested in this project. It would be interesting to establish a
range of linker lengths that gives several resolutions of the PPI network. The first step
would be to determine the maximal length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, so as to maximize the new information gained while taking into account the
complexity added with each additional linker. The availability of robotic platforms makes it
more realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would give a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the associations detected, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into consideration. For small protein complexes, a
finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser
resolution would be appropriate for large protein complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are examples. They are involved, respectively, in the degradation
and the storage of messenger RNA during cellular stress, and they are linked to various
diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of
resolution afforded by extending the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures
Yeast protein  Complex     Reference        Missing residues in C-ter
Rpa190         RNApol I    4C2M monomer 1   0
Rpa14          RNApol I    4C2M monomer 1   37
Rpa12          RNApol I    4C2M monomer 1   0
Rpb5           RNApol I    4C2M monomer 1   0
Rpb10          RNApol I    4C2M monomer 1   1
Rpa49          RNApol I    4C2M monomer 1   300
Rpc19          RNApol I    4C2M monomer 1   0
Rpb8           RNApol I    4C2M monomer 1   0
Rpa34          RNApol I    4C2M monomer 1   52
Rpa43          RNApol I    4C2M monomer 1   10
Rpc40          RNApol I    4C2M monomer 1   0
Rpc10          RNApol I    4C2M monomer 1   0
Rpa135         RNApol I    4C2M monomer 1   0
Rpo26          RNApol I    4C2M monomer 1   1
Rpa190         RNApol I    4C2M monomer 2   0
Rpa14          RNApol I    4C2M monomer 2   37
Rpa12          RNApol I    4C2M monomer 2   0
Rpb5           RNApol I    4C2M monomer 2   0
Rpb10          RNApol I    4C2M monomer 2   1
Rpa49          RNApol I    4C2M monomer 2   300
Rpc19          RNApol I    4C2M monomer 2   0
Rpb8           RNApol I    4C2M monomer 2   0
Rpa34          RNApol I    4C2M monomer 2   53
Rpa43          RNApol I    4C2M monomer 2   76
Rpc40          RNApol I    4C2M monomer 2   0
Rpc10          RNApol I    4C2M monomer 2   0
Rpa135         RNApol I    4C2M monomer 2   0
Rpo26          RNApol I    4C2M monomer 2   1
Rpa190         RNApol I    4C3I             1
Rpa14          RNApol I    4C3I             37
Rpb5           RNApol I    4C3I             0
Rpb10          RNApol I    4C3I             1
Rpa49          RNApol I    4C3I             301
Rpc19          RNApol I    4C3I             0
Rpb8           RNApol I    4C3I             0
Rpa34          RNApol I    4C3I             53
Rpa12          RNApol I    4C3I             0
Rpa43          RNApol I    4C3I             10
Rpc40          RNApol I    4C3I             0
Rpc10          RNApol I    4C3I             0
Rpa135         RNApol I    4C3I             0
Rpo26          RNApol I    4C3I             1
Rpb3           RNApol II   4V1N             50
Rpb11          RNApol II   4V1N             6
Rpb5           RNApol II   4V1N             0
Rpb7           RNApol II   4V1N             0
Rpb10          RNApol II   4V1N             5
Rpo26          RNApol II   4V1N             0
Rpb8           RNApol II   4V1N             0
Rpb4           RNApol II   4V1N             0
Rpb9           RNApol II   4V1N             2
Tfg2           RNApol II   4V1N             173
Rpb2           RNApol II   4V1N             0
Rpc10          RNApol II   4V1N             0
Rpo21          RNApol II   4V1N             278
Rpc11          RNApol III  5FJA             0
Rpc19          RNApol III  5FJA             0
Ret1           RNApol III  5FJA             0
Rpb5           RNApol III  5FJA             0
Rpb10          RNApol III  5FJA             3
Rpc37          RNApol III  5FJA             20
Rpc82          RNApol III  5FJA             0
Rpc31          RNApol III  5FJA             182
Rpb8           RNApol III  5FJA             0
Rpc53          RNApol III  5FJA             0
Rpc25          RNApol III  5FJA             0
Rpc34          RNApol III  5FJA             2
Rpo31          RNApol III  5FJA             0
Rpc40          RNApol III  5FJA             0
Rpc10          RNApol III  5FJA             0
Rpc17          RNApol III  5FJA             0
Rpo26          RNApol III  5FJA             2
Rpn6           Proteasome  5CZ4 and 5A5B    3
Rpn5           Proteasome  5CZ4 and 5A5B    3
Rpn3           Proteasome  5CZ4 and 5A5B    45
Rpn2           Proteasome  5CZ4 and 5A5B    20
Rpn1           Proteasome  5CZ4 and 5A5B    0
Rpn9           Proteasome  5CZ4 and 5A5B    6
Rpn8           Proteasome  5CZ4 and 5A5B    30
Pre10          Proteasome  5CZ4 and 5A5B    39
Pre6           Proteasome  5CZ4 and 5A5B    10
Pre7           Proteasome  5CZ4 and 5A5B    0
Rpt3           Proteasome  5CZ4 and 5A5B    0
Rpt2           Proteasome  5CZ4 and 5A5B    1
Pre2           Proteasome  5CZ4 and 5A5B    0
Rpt4           Proteasome  5CZ4 and 5A5B    10
Pre1           Proteasome  5CZ4 and 5A5B    3
Pre8           Proteasome  5CZ4 and 5A5B    0
Pre9           Proteasome  5CZ4 and 5A5B    12
Pup2           Proteasome  5CZ4 and 5A5B    9
Pup3           Proteasome  5CZ4 and 5A5B    0
Pup1           Proteasome  5CZ4 and 5A5B    6
Rpn13          Proteasome  5CZ4 and 5A5B    23
Rpn12          Proteasome  5CZ4 and 5A5B    2
Rpn11          Proteasome  5CZ4 and 5A5B    8
Rpn10          Proteasome  5CZ4 and 5A5B    71
Sem1           Proteasome  5CZ4 and 5A5B    0
Scl1           Proteasome  5CZ4 and 5A5B    0
Rpt1           Proteasome  5CZ4 and 5A5B    11
Pre4           Proteasome  5CZ4 and 5A5B    4
Pre5           Proteasome  5CZ4 and 5A5B    0
Rpt5           Proteasome  5CZ4 and 5A5B    0
Pre3           Proteasome  5CZ4 and 5A5B    0
Rpt6           Proteasome  5CZ4 and 5A5B    9
Rpn7           Proteasome  5CZ4 and 5A5B    7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a
z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results from the intra-complex experiments. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
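The thresholding described in panel (B) and the reciprocal comparison described in panel (C) can be illustrated with a short sketch. It is not the analysis pipeline behind the figure: the way z-scores are computed here (against the mean and sample standard deviation of a set of background colony sizes), the function names and the toy values are all assumptions made for illustration, and the cutoff is passed in as a parameter.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(colony_sizes, background_sizes):
    """Z-score of each colony size relative to a background distribution.

    Assumption: the background is summarised by its mean and sample
    standard deviation; the original analysis may define it differently.
    """
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background_sizes, cutoff):
    """Flag as positive every PPI whose z-score exceeds the cutoff."""
    return [z > cutoff for z in z_scores(colony_sizes, background_sizes)]


def pearson_r(xs, ys):
    """Pearson correlation between paired signals (A-B versus B-A orientation)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy colony sizes in arbitrary units, not data from the experiments.
background = [10, 12, 11, 9, 13]
tested = [11, 20]
print(call_positive(tested, background, cutoff=2.5))
print(pearson_r([1.0, 2.0, 3.0], [2.1, 3.9, 6.0]))
```

Keeping the cutoff as an explicit argument makes it easy to see how the set of positive PPIs changes when the threshold is moved.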
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
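A distance-weighted shortest path of the kind shown in panel (C) can be sketched as a graph search over surface residues. The following is only an illustration under stated assumptions: residues are reduced to single 3D points, two residues are connected when they lie within a hypothetical `max_edge` distance, edges are weighted by Euclidean distance, and Dijkstra's algorithm gives the path length. The residue names, coordinates and cutoff are invented for the example and do not come from the proteasome structure.

```python
import heapq
from math import dist, inf


def build_graph(coords, max_edge):
    """Connect residues closer than max_edge; edge weight = Euclidean distance."""
    names = list(coords)
    graph = {name: [] for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = dist(coords[a], coords[b])
            if d <= max_edge:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns inf when target cannot be reached."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbour, inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return inf


# Toy coordinates (arbitrary units): two C-termini and two surface residues.
coords = {
    "A_Cter": (0.0, 0.0, 0.0),
    "surf1": (3.0, 0.0, 0.0),
    "surf2": (3.0, 4.0, 0.0),
    "B_Cter": (6.0, 4.0, 0.0),
}
graph = build_graph(coords, max_edge=4.5)
print(shortest_path_length(graph, "A_Cter", "B_Cter"))
```

In this toy case the path has to follow the surface points, so it is longer than the straight line between the two termini, which is the reason for using a path over the surface rather than a direct distance through the complex.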
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in
space, without these associations necessarily being physical interactions. Such a method
would make it possible to examine and dissect the architecture of protein complexes in
greater depth. In practice, this meant modifying the length of the linkers used in the DHFR
PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing
linker length changed the interactions detected. It was also relevant to test the method on
protein complexes using several combinations of linkers of different lengths. Finally, the
validation could be completed by comparing the results with distances measured from the
available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, more than 30% of all unique interactions
detected were new. One could therefore expect to obtain nearly as many new interactions if
this variation of the PCA were applied to protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
used. For example, it could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers is informative
about the organization of protein complexes, particularly when it involves the central
proteins. Finally, the associations detected reflect well the organization of protein
complexes into subcomplexes. Comparing the distances between proteins in the proteasome
structures with the PCA results confirms that increasing linker length does make it possible
to detect associations between proteins that are further apart in space.
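One simple way to express this comparison is to group detected interactions by linker combination and summarise the structural distance between the partners in each group. The sketch below assumes a hypothetical record layout (linker combination, distance between the two proteins in the structure, detected or not) and uses invented values; it only illustrates the type of summary, not the analysis actually performed.

```python
from collections import defaultdict
from statistics import median


def median_distance_by_linker(records):
    """Median structural distance of detected PPIs for each linker combination.

    records: iterable of (linker_combination, distance, detected) tuples.
    The tuple layout is an assumption made for this example.
    """
    by_combo = defaultdict(list)
    for combo, distance, detected in records:
        if detected:
            by_combo[combo].append(distance)
    return {combo: median(values) for combo, values in by_combo.items()}


# Invented distances (arbitrary units), not results from the study.
records = [
    ("2xL-2xL", 20.0, True),
    ("2xL-2xL", 30.0, True),
    ("2xL-2xL", 90.0, False),
    ("4xL-4xL", 30.0, True),
    ("4xL-4xL", 50.0, True),
    ("4xL-4xL", 70.0, True),
]
print(median_distance_by_linker(records))
```

A larger median for the longer linker combination would be consistent with detection of more distant associations, although a real analysis would compare the full distributions rather than a single summary value.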
The modification made to the DHFR PCA is a real advance in the study of protein-protein
associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases
the capacity to detect distant protein-protein associations. In future experiments, it would be
appropriate to use the standard linker alongside the longer linkers, which would provide a
validation and a point of comparison and would reveal problems that may have occurred
during construction of the fusion proteins. For example, a faulty recombination event or the
appearance of mutations becomes easier to spot: interactions would be observed for the
correctly constructed protein, whereas the problematic one would show none. Adding this
control does, however, make the experiments and analyses more complex. Despite this
drawback, this variation of the DHFR PCA gives access to an additional hybrid method that
remains relatively simple. It requires no special infrastructure, yet it can also be applied at
large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps
the endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are brought very close together, which reduces false positives.
The DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study
PPIs under several growth conditions or in the presence of cellular perturbations. It can also
be followed in real time, which gives access to the study of interaction dynamics (56). These
features provide certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also have to be
determined, so as to maximize the new information gained while taking into account the
additional complexity that each added linker brings. The availability of robotic platforms
makes it more realistic to create collections of DHFR F[1,2] proteins with different linker
lengths. Such additional collections would give a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the associations detected, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its size
and flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be
needed for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but which are visible by electron microscopy or other imaging methods. Their
size greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are one example. They are involved, respectively, in
the degradation and the storage of messenger RNA during cellular stress, and they are linked
to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The
scale of resolution afforded by lengthening the linker would give us a general picture of
their architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I,
II and III and proteasome structures
Yeast protein   Complex   Reference   Number of missing residues in C-ter
Rpa190 RNApol I 4C2M monomer 1 0
Rpa14 RNApol I 4C2M monomer 1 37
Rpa12 RNApol I 4C2M monomer 1 0
Rpb5 RNApol I 4C2M monomer 1 0
Rpb10 RNApol I 4C2M monomer 1 1
Rpa49 RNApol I 4C2M monomer 1 300
Rpc19 RNApol I 4C2M monomer 1 0
Rpb8 RNApol I 4C2M monomer 1 0
Rpa34 RNApol I 4C2M monomer 1 52
Rpa43 RNApol I 4C2M monomer 1 10
Rpc40 RNApol I 4C2M monomer 1 0
Rpc10 RNApol I 4C2M monomer 1 0
Rpa135 RNApol I 4C2M monomer 1 0
Rpo26 RNApol I 4C2M monomer 1 1
Rpa190 RNApol I 4C2M monomer 2 0
Rpa14 RNApol I 4C2M monomer 2 37
Rpa12 RNApol I 4C2M monomer 2 0
Rpb5 RNApol I 4C2M monomer 2 0
Rpb10 RNApol I 4C2M monomer 2 1
Rpa49 RNApol I 4C2M monomer 2 300
Rpc19 RNApol I 4C2M monomer 2 0
Rpb8 RNApol I 4C2M monomer 2 0
Rpa34 RNApol I 4C2M monomer 2 53
Rpa43 RNApol I 4C2M monomer 2 76
Rpc40 RNApol I 4C2M monomer 2 0
Rpc10 RNApol I 4C2M monomer 2 0
Rpa135 RNApol I 4C2M monomer 2 0
Rpo26 RNApol I 4C2M monomer 2 1
Rpa190 RNApol I 4C3I 1
Rpa14 RNApol I 4C3I 37
Rpb5 RNApol I 4C3I 0
Rpb10 RNApol I 4C3I 1
Rpa49 RNApol I 4C3I 301
Rpc19 RNApol I 4C3I 0
Rpb8 RNApol I 4C3I 0
Rpa34 RNApol I 4C3I 53
Rpa12 RNApol I 4C3I 0
Rpa43 RNApol I 4C3I 10
Rpc40 RNApol I 4C3I 0
Rpc10 RNApol I 4C3I 0
Rpa135 RNApol I 4C3I 0
Rpo26 RNApol I 4C3I 1
Rpb3 RNApol II 4V1N 50
Rpb11 RNApol II 4V1N 6
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right owing to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results of the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in a spot-dilution assay of selected results from the
intra-complex experiment. Examples for each category of changes are shown. Cell growth in
the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
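As an illustration of the thresholding described in panel (B), the following minimal sketch converts colony sizes to z-scores and flags pairs above a cutoff. It assumes z-scores are computed against the mean and sample standard deviation of all measured colony sizes, which the caption does not specify; the function names and the colony sizes are hypothetical. The cutoff is passed in as a parameter, and the example call uses 2.5, our reading of the caption's threshold.

```python
from statistics import mean, stdev


def zscores(values):
    """Return the z-score of each value against the sample mean and standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


def call_positive_ppis(colony_sizes, threshold):
    """Map each (bait, prey) pair to True when its colony-size z-score exceeds the threshold.

    Assumption: the reference distribution is the full set of measured colony sizes.
    """
    pairs = list(colony_sizes)
    scores = zscores([colony_sizes[pair] for pair in pairs])
    return {pair: score > threshold for pair, score in zip(pairs, scores)}


# Hypothetical colony sizes: twenty background pairs and one strong signal.
example = {("bait%d" % i, "prey"): (95 if i % 2 else 105) for i in range(20)}
example[("Rpn5", "Rpn6")] = 400

positives = [pair for pair, hit in call_positive_ppis(example, threshold=2.5).items() if hit]
print(positives)
```

In a real screen the reference distribution would more likely be a set of negative controls or the bulk of a plate, so the choice of background is the main design decision this sketch leaves open.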
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres (dots): surface. Green spheres: surface residues on the proteasome.
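The distance-weighted shortest path in panels (C) and (D) can be sketched as a Dijkstra search over surface residues. This is a minimal illustration and not the thesis pipeline: it assumes that two residues are linked when they lie within a cutoff distance and that each link is weighted by the Euclidean distance between them. The function name, the cutoff value and the coordinates are all hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Length of the distance-weighted shortest path between two residues.

    coords maps a residue label to its (x, y, z) position. Residues closer
    than `cutoff` are treated as connected (an assumption of this sketch).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    done = set()
    while heap:
        dist_so_far, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == end:
            return dist_so_far
        for other, xyz in coords.items():
            if other in done:
                continue
            step = math.dist(coords[node], xyz)
            candidate = dist_so_far + step
            if step <= cutoff and candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(heap, (candidate, other))
    return math.inf


# Hypothetical coordinates: two C-termini and two intermediate surface residues.
surface = {
    "Scl1_Cter": (0.0, 0.0, 0.0),
    "surf_1": (6.0, 0.0, 0.0),
    "surf_2": (6.0, 8.0, 0.0),
    "Rpn5_Cter": (12.0, 8.0, 0.0),
}
print(shortest_surface_path(surface, "Scl1_Cter", "Rpn5_Cter", cutoff=10.5))
```

Because the path is restricted to hops between neighbouring surface residues, it is at least as long as the straight-line distance between the two C-termini.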
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are in
close proximity in space without necessarily interacting physically. Such a method would
make it possible to examine and dissect the architecture of protein complexes in greater
depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validity of the method could be further confirmed
by comparing the results obtained with distances measured from the available protein
structures of the proteasome.
The results of the first validation show that adjusting a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed
associations to be identified more reliably. Seven new associations were observed within
protein complexes and between different complexes, notably between the proteasome and the
actin cytoskeleton. The nature of the associations detected suggests that the specificity of
the DHFR PCA is preserved despite the change in linker length. The in-depth study of the
five protein complexes shows that this variation of the DHFR PCA detects new interactions
while preserving the specificity of the method. Indeed, among all the unique interactions
detected, more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have
already been studied. This percentage could vary with the number of combinations of linkers
of different lengths that are used. For example, it could be lower if only a single
combination were used, since some protein-protein associations were detectable only with
one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment
appears to be sufficient to detect most of the new PPIs and most of those whose signal
increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers is informative
about the organization of protein complexes, particularly when it involves the central
proteins. Finally, the associations detected reflect well the organization of protein
complexes into subcomplexes. By comparing the distances between proteins in the proteasome
structures with the PCA results obtained, it is possible to confirm that increasing the
linker length does allow associations between proteins that are further apart in space to
be detected.
The modification made to the DHFR PCA represents a notable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to linkers of
other lengths, which would provide a validation and a point of comparison and would help
detect problems that may have occurred during the construction of the fusion proteins. For
example, it becomes easier to spot a faulty recombination event or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. This additional control does, however, make the experiments
and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an
additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform.
Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are brought
very close together, which reduces false positives. The DHFR PCA can be performed on either
solid or liquid medium. It is therefore easy to study PPIs under several growth conditions
or in the presence of cellular perturbations. It can also be monitored in real time, which
gives access to the study of interaction dynamics (56). These features provide certain
advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that gives access to several resolutions of the PPI network. It
would first be necessary to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment that maximizes new information while taking into account the
complexity added with each additional linker. The availability of robotic platforms makes
it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the associations detected, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into consideration. For small protein complexes, a
finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser
resolution would be needed for large protein complexes.
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
fully known but that are visible by electron microscopy or other imaging methods. The size
of these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution afforded by the extended linker would give us
a general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
38
Rpb5 RNApol II 4V1N 0
Rpb7 RNApol II 4V1N 0
Rpb10 RNApol II 4V1N 5
Rpo26 RNApol II 4V1N 0
Rpb8 RNApol II 4V1N 0
Rpb4 RNApol II 4V1N 0
Rpb9 RNApol II 4V1N 2
Tfg2 RNApol II 4V1N 173
Rpb2 RNApol II 4V1N 0
Rpc10 RNApol II 4V1N 0
Rpo21 RNApol II 4V1N 278
Rpc11 RNApol III 5FJA 0
Rpc19 RNApol III 5FJA 0
Ret1 RNApol III 5FJA 0
Rpb5 RNApol III 5FJA 0
Rpb10 RNApol III 5FJA 3
Rpc37 RNApol III 5FJA 20
Rpc82 RNApol III 5FJA 0
Rpc31 RNApol III 5FJA 182
Rpb8 RNApol III 5FJA 0
Rpc53 RNApol III 5FJA 0
Rpc25 RNApol III 5FJA 0
Rpc34 RNApol III 5FJA 2
Rpo31 RNApol III 5FJA 0
Rpc40 RNApol III 5FJA 0
Rpc10 RNApol III 5FJA 0
Rpc17 RNApol III 5FJA 0
Rpo26 RNApol III 5FJA 2
Rpn6 Proteasome 5CZ4 and 5A5B 3
Rpn5 Proteasome 5CZ4 and 5A5B 3
Rpn3 Proteasome 5CZ4 and 5A5B 45
Rpn2 Proteasome 5CZ4 and 5A5B 20
Rpn1 Proteasome 5CZ4 and 5A5B 0
Rpn9 Proteasome 5CZ4 and 5A5B 6
Rpn8 Proteasome 5CZ4 and 5A5B 30
Pre10 Proteasome 5CZ4 and 5A5B 39
Pre6 Proteasome 5CZ4 and 5A5B 10
Pre7 Proteasome 5CZ4 and 5A5B 0
Rpt3 Proteasome 5CZ4 and 5A5B 0
Rpt2 Proteasome 5CZ4 and 5A5B 1
Pre2 Proteasome 5CZ4 and 5A5B 0
Rpt4 Proteasome 5CZ4 and 5A5B 10
Pre1 Proteasome 5CZ4 and 5A5B 3
Pre8 Proteasome 5CZ4 and 5A5B 0
Pre9 Proteasome 5CZ4 and 5A5B 12
Pup2 Proteasome 5CZ4 and 5A5B 9
Pup3 Proteasome 5CZ4 and 5A5B 0
Pup1 Proteasome 5CZ4 and 5A5B 6
Rpn13 Proteasome 5CZ4 and 5A5B 23
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score
above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL linker to
detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for
selected results of the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex
experiment. Examples for each category of changes are shown. Cell growth in spot-dilution
assay (right) correlates with colony size in standard PCA (left).
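The two quantities used in panels (B) and (C), a z-score cutoff on colony size and a correlation
between reciprocal PPI signals, can be sketched as follows. This is a minimal illustration, not
the analysis pipeline of the thesis: it assumes that z-scores are taken against the mean and
sample standard deviation of all colony sizes supplied, which the caption does not specify, and
the function names and the `cutoff` argument are hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Z-score of each colony size against the mean and sample SD of the list.

    Assumption: the whole list serves as the background distribution.
    """
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    if sigma == 0:
        raise ValueError("colony sizes have zero variance")
    return [(size - mu) / sigma for size in colony_sizes]


def positive_calls(colony_sizes, cutoff):
    """True for each colony whose z-score is strictly above the cutoff."""
    return [z > cutoff for z in z_scores(colony_sizes)]


def pearson_r(x, y):
    """Pearson correlation between paired signals, e.g. A-B versus B-A."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("one of the sequences has zero variance")
    return cov / math.sqrt(var_x * var_y)
```

Passing the cutoff as an argument keeps the sketch independent of any particular threshold, so
the same functions can be reused for each linker combination.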
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
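A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's
algorithm on a graph of surface points. This is only an illustration under stated assumptions:
surface residues are represented by 3D coordinates, two residues are linked when they lie within
a neighbor cutoff, and each edge is weighted by its Euclidean length. The cutoff parameter and
the function name are hypothetical, and the procedure actually used in the thesis may differ.

```python
import heapq
import math


def surface_path_length(coords, start, end, cutoff):
    """Length of the shortest path from coords[start] to coords[end].

    coords : list of (x, y, z) tuples for surface residues (assumed input).
    cutoff : maximum distance for two residues to be linked (assumed parameter).
    Returns math.inf when the two residues are not connected.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    done = set()
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == end:
            return dist_u
        for v in range(len(coords)):
            if v in done:
                continue
            step = math.dist(coords[u], coords[v])
            # Only neighbors within the cutoff form an edge of the graph.
            if step <= cutoff and dist_u + step < best.get(v, math.inf):
                best[v] = dist_u + step
                heapq.heappush(heap, (dist_u + step, v))
    return math.inf
```

Because the path is forced through neighboring surface points, the result is at least as long as
the straight-line distance between the two C-termini; the neighbor scan here is quadratic, which
is acceptable for a sketch but would call for a spatial index on a full structure.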
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method is one that detects associations between proteins that are close to each other in space
without necessarily being in direct physical contact. Such a method would make it possible to
examine and dissect the architecture of protein complexes in greater depth. In practice, the
approach consisted of modifying the length of the linkers used in the DHFR PCA in S.
cerevisiae. To validate the method, it was first necessary to check whether increasing linker
length changed the set of detected interactions. It was also relevant to test the method on
protein complexes using several combinations of linkers of different lengths. Finally, the
validity of the method could be further confirmed by comparing the results with distances
measured from the available protein structures of the proteasome.
The first validation showed that changing a single parameter, namely doubling the length of
one linker, significantly increased the signal-to-noise ratio and thereby improved the
identification of associations. Seven new associations were observed within protein complexes
and between different complexes, notably between the proteasome and the actin cytoskeleton.
The nature of the detected associations suggests that the specificity of the DHFR PCA is
preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variant of the DHFR PCA detects new interactions while retaining
the specificity of the method. Indeed, more than 30% of all unique interactions detected were
new. One could therefore expect a similar proportion of new interactions if this variant of
the PCA were applied to other protein complexes that have already been studied. This
percentage could vary with the number of linker-length combinations used. For example, it
could be lower if only one combination were used, because some protein-protein associations
were detectable only with a specific combination of linkers. Using an extended linker on the
DHFR F[1,2] fragment alone appears to be sufficient to detect most of the new PPIs and most
of those whose signal increases. The rare cases in which the signal decreased as linker length
increased are more likely caused by steric effects than by destabilization of the proteins
involved. These cases can nonetheless provide structural information, notably by identifying
the strongest associations within the complex. In addition, extended linkers provide
information on the organization of protein complexes, particularly when central proteins are
involved. Finally, the detected associations closely reflect the organization of protein
complexes into subcomplexes. Comparing the distances between proteins in the proteasome
structures with the PCA results confirms that increasing linker length does allow the
detection of associations between proteins that are farther apart in space.
The modification made to the DHFR PCA is a useful advance for the study of protein-protein
associations. Doubling the length of the linker on the DHFR F[1,2] fragment alone increases
the ability to detect distant protein-protein associations. In future experiments, it would
be appropriate to use the standard linker alongside the longer linkers; this would provide a
validation and a point of comparison, and would help detect problems that arose during
construction of the fusion proteins. For example, faulty recombination or the appearance of
mutations would be easier to spot: interactions would be observed for the correctly
constructed protein, whereas the faulty one would show none. Adding this control does,
however, make the experiments and analyses more complex. Despite this drawback, this variant
of the DHFR PCA provides an additional hybrid method that remains relatively simple. It
requires no special infrastructure, yet it can also be applied at large scale with a robotic
platform. Moreover, the DHFR PCA is an in vivo method in which proteins remain expressed from
their endogenous promoters. The fragments do not tend to associate spontaneously unless they
are brought very close together, which reduces false positives. The DHFR PCA can be performed
on solid or in liquid medium, so PPIs are easy to study under several growth conditions or in
the presence of cellular perturbations. It can also be followed in real time, which gives
access to the dynamics of interactions (56). These features offer certain advantages over
other hybrid methods.
Only two linker lengths were tested in this project. It would be interesting to establish a
range of linker lengths that provides several resolutions of the PPI network. The first step
would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, so as to maximize new information while accounting for the additional complexity
introduced by each added linker length. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths.
Such additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the detected associations, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be
appropriate for large protein complexes.
The method developed during this master's project is particularly attractive for the study of
macromolecular protein complexes. These are complexes whose composition is not fully known
but that are visible by electron microscopy or other imaging methods. Their size greatly
limits their study and makes determining their architecture a challenge. Processing bodies
and stress granules are examples. They are involved, respectively, in the degradation and the
storage of messenger RNA during cellular stress, and they have been linked to various
diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of
resolution provided by extending the linker would give a general picture of their
architecture. At the level of an organism's proteome, the method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Rpn12 Proteasome 5CZ4 and 5A5B 2
Rpn11 Proteasome 5CZ4 and 5A5B 8
Rpn10 Proteasome 5CZ4 and 5A5B 71
Sem1 Proteasome 5CZ4 and 5A5B 0
Scl1 Proteasome 5CZ4 and 5A5B 0
Rpt1 Proteasome 5CZ4 and 5A5B 11
Pre4 Proteasome 5CZ4 and 5A5B 4
Pre5 Proteasome 5CZ4 and 5A5B 0
Rpt5 Proteasome 5CZ4 and 5A5B 0
Pre3 Proteasome 5CZ4 and 5A5B 0
Rpt6 Proteasome 5CZ4 and 5A5B 9
Rpn7 Proteasome 5CZ4 and 5A5B 7
Figure S1. Data related to the PCA experiments.
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome,
top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony
size above the threshold (dashed or gray lines) correspond to positive PPIs and have a
z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths, according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to
detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for
selected results from the global PCA experiment, showing strong reproducibility. (F)
Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex
experiment. Examples for each category of change are shown. Cell growth in the spot-dilution
assay (right) correlates with colony size in the standard PCA (left).
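The thresholding described in panel (B) can be illustrated with a minimal Python sketch that converts colony sizes to z-scores and keeps the pairs above a cutoff. This is not the analysis pipeline used in this work: the function name, the choice of the mean and population standard deviation of the whole plate as the reference distribution, the cutoff passed in the example and the plate values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def positive_pairs(colony_sizes, z_cutoff):
    """Return {pair: z-score} for pairs whose z-score exceeds z_cutoff.

    colony_sizes maps a (bait, prey) pair to its colony size.
    Assumption: z-scores are computed relative to all pairs on the plate.
    """
    sizes = list(colony_sizes.values())
    mu = mean(sizes)
    sigma = pstdev(sizes)
    if sigma == 0:
        # No variation on the plate, so no pair can stand out.
        return {}
    z = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: score for pair, score in z.items() if score > z_cutoff}


# Hypothetical plate: 20 background pairs and one strong signal.
plate = {("bait", f"prey{i}"): 100.0 for i in range(20)}
plate[("bait", "hit")] = 1000.0
hits = positive_pairs(plate, z_cutoff=2.5)  # cutoff chosen for the example
```

With a real screen, the reference distribution would more plausibly be the background of non-interacting pairs than the whole plate; the sketch only shows the mechanics of the cutoff.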
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
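The idea of a distance-weighted shortest path in panels (C) and (D) can be sketched in Python with Dijkstra's algorithm. The sketch assumes that surface residues are represented as 3D points, that two points are connected when they lie within a cutoff distance of each other, and that each edge is weighted by its Euclidean length. The function name, the cutoff parameter and the toy coordinates are hypothetical; the source does not state how the graph was actually built.

```python
import heapq
import math


def surface_path_length(points, start, goal, cutoff):
    """Length of the shortest path from points[start] to points[goal].

    Edges join points at most `cutoff` apart and are weighted by their
    Euclidean distance. Returns math.inf if the goal is unreachable.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        if u == goal:
            return d
        done.add(u)
        for v in range(len(points)):
            if v == u or v in done:
                continue
            step = math.dist(points[u], points[v])
            if step <= cutoff and d + step < best.get(v, math.inf):
                best[v] = d + step
                heapq.heappush(heap, (d + step, v))
    return math.inf


# Toy example: three points forming a right angle.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
around = surface_path_length(pts, 0, 2, cutoff=1.0)  # must pass through point 1
direct = surface_path_length(pts, 0, 2, cutoff=1.5)  # direct edge allowed
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex rather than cut through it, which is why the result can exceed the straight-line distance between the two C-termini.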
General conclusion
The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
to each other in space, without these associations necessarily being physical interactions.
Such a method would make it possible to examine and dissect the architecture of protein
complexes in greater depth. In practice, this meant modifying the length of the linkers
used in DHFR PCA in S. cerevisiae. To validate the method, we first had to verify whether
increasing linker length changed the set of interactions detected. It was also relevant to
test the method on protein complexes using several combinations of linkers of different
lengths. Finally, the validity of the method could be further confirmed by comparing the
results with distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and thus improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variant of DHFR PCA detects new interactions while preserving the
specificity of the method. Indeed, more than 30% of all the unique interactions detected
were new. One could therefore expect to obtain nearly as many new interactions if this
variant of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths that
are used. For example, it could be lower if only one combination were used, because some
protein-protein associations were detectable only with a specific combination of linkers.
Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most
of the new PPIs and most of those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers provides
information on the organization of protein complexes, particularly when central proteins are
involved. Finally, the associations detected reflect well the organization of protein
complexes into subcomplexes. Comparing the distances between proteins in the proteasome
structures with the PCA results confirms that increasing linker length does allow the
detection of associations between proteins that are further apart in space.
The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to linkers of
other lengths. This would provide a validation and a point of comparison, and would help
detect problems that may have occurred during the construction of the fusion proteins. For
example, it becomes easier to spot a faulty recombination event or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Adding this control does, however, make the experiments
and analyses more complex. Despite this drawback, this variant of DHFR PCA provides an
additional hybrid method that remains relatively simple. It requires no specialized
infrastructure, yet it can also be applied at large scale using a robotic platform.
Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are brought
very close together, which reduces false positives. DHFR PCA can be performed on either
solid or liquid medium. It is therefore easy to study PPIs under several growth conditions
or in the presence of cellular perturbations. It can also be monitored in real time, which
gives access to the study of interaction dynamics (56). These features offer certain
advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also have to be
determined, so as to maximize the new information gained while taking into account the
additional complexity introduced by each added linker. The availability of robotic platforms
makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different
linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the associations detected, which lowers the
molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into account. For small
protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient,
whereas a coarser resolution would be needed for large protein complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. The size of
these complexes greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired immunodeficiency
syndrome (102-104). The scale of resolution provided by linker extension would give us a
general picture of their architecture. At the level of an organism's proteome, this method
would provide a better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
40
41
Figure S1 Data related to the PCA experiments
(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability Act1 protein was used as a loading control (B) Distribution of PPIs signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes (Proteasome - top right
RNApol I II and III - bottom left and COG complex - bottom right) experiments PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 25 (C) Example of correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination Correlation coefficients for the other
combinations are r=092 for 2xL-2xL r=053 for 2xL-4xL and r=074 for 4xL-2xL (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins The red line represents the density of
distances for all interactions The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected The 4xL-4xL
distributions is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space (E) Repetition of the standard DHFR PCA for selected results
for the global PCA experiment showing a strong reproducibility (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results for the intra-complexes experiment Examples
for each category of changes are shown Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left)
42
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dots surface. Green spheres: surface residues on the proteasome.
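The distance-weighted shortest path of panels (C) and (D) can be sketched as a graph search. The sketch below assumes that surface residues are treated as nodes, that two residues are connected when they lie within a cutoff distance of each other, that each edge is weighted by the Euclidean distance, and that the path is found with Dijkstra's algorithm. The caption does not specify the software or the cutoff that were actually used, so the function names, the cutoff and the coordinates are hypothetical.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of points within `cutoff`; edge weight is the Euclidean distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf

# Toy coordinates (in angstroms) standing in for surface residues; not real structural data.
coords = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (6, 4, 0)]
graph = build_graph(coords, cutoff=4.5)
print(shortest_path_length(graph, 0, 3))
```

In this toy example the direct distance between the first and last points is about 7.2, whereas the path through the intermediate points is longer. This is the kind of difference a path constrained to surface residues would be expected to capture relative to a straight line through the complex.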
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, the term
hybrid method refers to a method that detects associations between proteins that are in close
proximity in space without necessarily being physical interactions. Such a method would make
it possible to dissect the architecture of protein complexes in greater depth. In practice,
the approach consisted of modifying the length of the linkers used in DHFR PCA in S. cerevisiae.
To validate the method, it was first necessary to verify whether increasing the linker length
changed the interactions detected. It was also relevant to test the method on protein
complexes using several combinations of linkers of different lengths. Finally, the validation
of the method could be completed by comparing the results obtained with distances measured
from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio and allowed
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while retaining the
specificity of the method. Indeed, more than 30% of all the unique interactions detected were
new. One could therefore expect to obtain nearly as many new interactions if this variation
of the PCA were applied to protein complexes that have already been studied. This percentage
could vary with the number of combinations of linkers of different lengths that are used. For
example, it could be lower if only a single combination were used, because some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the
majority of the new PPIs and of those whose signal increases.
The rare cases in which the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. Nevertheless,
these cases can still provide structural information, notably by identifying the strongest
associations within the complex. In addition, the use of extended linkers provides
information on the organization of protein complexes, particularly when central proteins are
involved. Finally, the detected associations reflect well the organization of protein
complexes into subcomplexes. Comparing the distances between proteins in the proteasome
structures with the PCA results confirms that increasing the linker length does allow the
detection of associations between proteins that are farther apart in space.
The modification made to DHFR PCA represents a useful advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the capacity to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the longer linkers. This would provide a validation and a point of comparison,
and would help detect problems that may have arisen during the construction of the fusion
proteins. For example, it becomes easier to spot incorrect recombination or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic one would show none. Admittedly, adding this control makes the experiments and
analyses more complex. Despite this drawback, this variation of DHFR PCA provides an
additional hybrid method that remains relatively simple. It does not require any particular
infrastructure, yet it can also be applied at large scale using a robotic platform.
Furthermore, DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are very close
together, which reduces false positives. DHFR PCA can be performed on either solid or liquid
medium. It is therefore easy to study PPIs under multiple growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives access
to the study of interaction dynamics (56). These features provide certain advantages over
other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths in order to view the PPI network at several resolutions. It would
first be necessary to determine the maximum length that still detects plausible
protein-protein associations while limiting false positives. It would also be necessary to
determine the optimal increment to maximize the new information gained, taking into account
the additional complexity introduced by each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the detected associations, which lowers the molecular
resolution. Before investigating a protein complex more exhaustively, its characteristics,
such as its size and flexibility, should be taken into consideration. For small protein
complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a
coarser resolution would be needed for large protein complexes.
The method developed during this master's project is particularly interesting for the study
of macromolecular protein complexes. These are complexes whose composition is not fully known
but which are visible by electron microscopy or by other imaging methods. The size of these
complexes greatly limits their study and makes determining their architecture a challenge.
Processing bodies and stress granules are examples. They are involved, respectively, in the
degradation and the storage of messenger RNA during cellular stress, and they are notably
linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104).
The scale of resolution afforded by extending the linker would give a general picture of
their architecture. At the level of an organism's proteome, this method would provide a
better definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
complexifie les expeacuteriences et les analyses Malgreacute cet inconveacutenient cette variation de la
DHFR PCA donne accegraves agrave une meacutethode hybride additionnelle qui demeure relativement
simple Elle ne neacutecessite pas drsquoinfrastructure particuliegravere mais peut aussi ecirctre appliqueacutee agrave
grande eacutechelle agrave lrsquoaide drsquoune plateforme robotique Par ailleurs la DHFR PCA est une
meacutethode in vivo qui conserve le promoteur endogegravene pour lrsquoexpression des proteacuteines Les
fragments nrsquoont pas tendance agrave interagir spontaneacutement ensemble agrave lrsquoexception de srsquoils sont
tregraves rapprocheacutes ce qui reacuteduit les faux-positifs La DHFR PCA peut ecirctre faite soit en milieu
solide ou en milieu liquide Il est donc facile drsquoeacutetudier les PPI en preacutesence de plusieurs
conditions de croissance ou en preacutesence de perturbations cellulaires Elle peut drsquoailleurs ecirctre
45
suivie en temps reacuteel ce qui donne accegraves agrave lrsquoeacutetude de la dynamique des interactions (56) Ces
eacuteleacutements apportent certains avantages comparativement aux autres meacutethodes hybrides
Dans ce projet uniquement deux longueurs de connecteur ont eacuteteacute testeacutees Il serait inteacuteressant
drsquoeacutetablir une gamme de longueurs de connecteurs permettant drsquoavoir plusieurs reacutesolutions
du reacuteseau de PPI Il faudrait drsquoabord deacuteterminer la longueur maximale permettant de deacutetecter
des associations proteacuteine-proteacuteine plausibles limitant les faux-positifs Il faudrait aussi
deacuteterminer lrsquoincreacutementation optimale pour maximiser les nouvelles informations en prenant
en compte la complexiteacute additionnelle agrave chaque ajout de connecteurs La disponibiliteacute de
plateformes robotiques rend plus reacutealiste la creacuteation de collections de proteacuteines DHFR F[12]
avec diffeacuterentes longueurs de connecteur Lrsquoexistence de telles collections suppleacutementaires
permettrait drsquoavoir une image agrave diffeacuterentes reacutesolutions de preacutecise agrave grossiegravere du reacuteseau
drsquoassociations proteacuteine-proteacuteine de la levure En effet plus la longueur du connecteur est
augmenteacutee plus les associations deacutetecteacutees sont distantes ce qui diminue la reacutesolution
moleacuteculaire Avant drsquoinvestiguer plus exhaustivement un complexe proteacuteique il faudrait
prendre en consideacuteration ses caracteacuteristiques comme sa taille et sa flexibiliteacute Dans le cas de
petits complexes proteacuteiques il pourrait srsquoaveacuterer suffisant drsquoutiliser une reacutesolution plus fine
et donc des connecteurs plus courts alors que la reacutesolution devrait ecirctre moindre pour les
gros complexes proteacuteiques
La meacutethode deacuteveloppeacutee lors de ce projet de maicirctrise devient particuliegraverement inteacuteressante
pour lrsquoeacutetude des complexes proteacuteiques macromoleacuteculaires Ce sont des complexes dont la
composition nrsquoest pas parfaitement connue mais qui sont visibles en microscopie
eacutelectronique ou agrave lrsquoaide drsquoautres meacutethodes drsquoimagerie La taille de ces complexes limite
grandement leur eacutetude et repreacutesente un deacutefi dans la deacutetermination de leur architecture Les laquo
Processing bodies raquo et les granules de stress en sont un exemple Ils sont impliqueacutes
respectivement dans la deacutegradation et la conservation drsquoARN messager lors de stress
cellulaires et ils sont notamment relieacutes agrave diverses maladies telles que le cancer et le syndrome
de lrsquoimmunodeacuteficience acquise (102-104) Lrsquoeacutechelle de reacutesolution permise par
lrsquoallongement du connecteur nous permettrait drsquoavoir une conception geacuteneacuterale de leur
architecture Dans le cas du proteacuteome drsquoun organisme cette meacutethode apporterait une
meilleure deacutefinition de lrsquoorganisation de la machinerie cellulaire
46
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.
(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: surface dots. Green spheres: surface residues on the proteasome.
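
The distance-weighted shortest path shown in panels (C) and (D) can be illustrated with a small sketch. This is not the procedure used to produce the figure, whose details are not given here. It assumes that each surface residue is reduced to one 3D coordinate, that two residues are neighbours when they lie within a hypothetical cutoff `max_edge`, and that each edge is weighted by its Euclidean length. Dijkstra's algorithm then returns the length of the shortest route along the surface.

```python
import heapq
import math


def surface_path_length(coords, start, end, max_edge=8.0):
    """Length of the shortest path from start to end through surface points.

    coords: list of (x, y, z) tuples, one per surface residue (hypothetical input).
    max_edge: assumed cutoff, in the same units as coords, below which two
        residues are treated as neighbours on the surface.
    Returns math.inf when the two residues are not connected.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for other in range(len(coords)):
            if other == node:
                continue
            step = math.dist(coords[node], coords[other])
            if step > max_edge:
                continue  # too far apart to be surface neighbours
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(queue, (candidate, other))
    return math.inf


# Toy coordinates: the direct 0 -> 2 segment exceeds the cutoff,
# so the path has to go through point 1.
points = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (6.0, 6.0, 0.0)]
detour = surface_path_length(points, 0, 2)
straight = math.dist(points[0], points[2])
```

In this toy case the surface path is longer than the straight-line distance, which is the reason a path along the surface can be a more realistic estimate of the span a linker has to cover than a line drawn through the complex.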
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, "hybrid
method" refers to a method that detects associations between proteins that are close to each
other in space without necessarily being in physical contact. Such a method would make it
possible to dissect the architecture of protein complexes in greater depth. In practice, the
approach consisted of modifying the length of the linkers used in the DHFR PCA in
S. cerevisiae. To validate the method, we first had to verify whether increasing linker
length changed the interactions detected. It was also relevant to test the method on protein
complexes using several combinations of linkers of different lengths. Finally, the validation
could be completed by comparing the results with distances measured from the available
protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, which improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
retaining the specificity of the method. Indeed, of all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new interactions
if this variation of the PCA were applied to protein complexes that have already been
studied. This percentage could vary with the number of combinations of linkers of different
lengths used. For example, it could be lower if only a single combination were used, since
some protein-protein associations were detectable only with one specific combination of
linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to
detect most of the new PPIs and those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. Even so, these
cases can provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers is informative about the
organization of protein complexes, particularly when central proteins are involved. Finally,
the associations detected reflect well the organization of protein complexes into
subcomplexes. Comparing the distances between proteins in the proteasome structures with the
PCA results confirms that increasing linker length does allow the detection of associations
between proteins that are farther apart in space.
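
The proportion of new interactions discussed above can be computed along the lines of the following sketch. It assumes that interactions are stored as unordered protein pairs and that a set of previously reported pairs is available for comparison. The protein labels and both lists are invented for illustration and are not results from this work.

```python
def novel_fraction(detected, known):
    """Return the fraction of unique detected pairs that are absent from `known`.

    Pairs are unordered, so (A, B) and (B, A) count as a single interaction.
    """
    detected_pairs = {frozenset(pair) for pair in detected}
    known_pairs = {frozenset(pair) for pair in known}
    if not detected_pairs:
        return 0.0
    return len(detected_pairs - known_pairs) / len(detected_pairs)


# Hypothetical data, for illustration only
detected = [("P1", "P2"), ("P2", "P1"), ("P1", "P3"), ("P3", "P4")]
known = [("P1", "P2"), ("P4", "P3")]
fraction = novel_fraction(detected, known)
```

Counting unique unordered pairs matters here because the same association can be detected in both bait-prey orientations and with several linker combinations.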
The modification made to the DHFR PCA represents a useful advance in the study of
protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2]
fragment increases the capacity to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker alongside the linkers of
additional lengths. This would provide a validation and a point of comparison, and would help
detect problems that arose during construction of the fusion proteins. For example, a faulty
recombination event or the appearance of mutations is easier to spot: interactions would be
observed for the correctly constructed protein, whereas the problematic one would show none.
Admittedly, adding this control makes the experiments and analyses more complex. Despite this
drawback, this variation of the DHFR PCA provides an additional hybrid method that remains
relatively simple. It requires no special infrastructure, yet it can also be applied at large
scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the
endogenous promoter for protein expression. The fragments do not tend to associate
spontaneously unless they are very close together, which reduces false positives. The DHFR
PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under
multiple growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the dynamics of interactions (56). These
features offer certain advantages over other hybrid methods.
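
The control described above, in which each extended-linker construct is compared with its standard-linker counterpart, could be automated roughly as follows. The sketch assumes a hypothetical count of detected interactions per protein for each linker. A protein that shows interactions with one linker and none with the other is flagged for checking, for example by sequencing the construct. The labels and counts are invented.

```python
def flag_suspect_constructs(standard_counts, extended_counts):
    """Return proteins with interactions under one linker but none under the other.

    Both arguments map a protein label to its number of detected interactions.
    The returned dict names the construct that showed no interaction.
    """
    suspects = {}
    for protein in sorted(set(standard_counts) | set(extended_counts)):
        n_standard = standard_counts.get(protein, 0)
        n_extended = extended_counts.get(protein, 0)
        if n_standard > 0 and n_extended == 0:
            suspects[protein] = "extended-linker construct"
        elif n_extended > 0 and n_standard == 0:
            suspects[protein] = "standard-linker construct"
    return suspects


# Hypothetical interaction counts, for illustration only
standard = {"P1": 5, "P2": 3, "P3": 0}
extended = {"P1": 7, "P2": 0, "P3": 4}
suspects = flag_suspect_constructs(standard, extended)
```

A flag is only a prompt for inspection. A protein with no interaction under the standard linker is not necessarily faulty, since its partners may simply be too distant to be detected without the extended linker.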
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would provide several resolutions of the PPI network. First, the
maximum length that still detects plausible protein-protein associations, and thus limits
false positives, would need to be determined. The optimal increment would also need to be
determined, to maximize the new information gained while taking into account the additional
complexity of each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths.
Such additional collections would give a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the
more distant the detected associations, which lowers the molecular resolution. Before
investigating a protein complex more exhaustively, its characteristics, such as size and
flexibility, should be taken into account. For small protein complexes, a finer resolution,
and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be
needed for large protein complexes.
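
One way to read a multi-resolution dataset of this kind is to record, for each protein pair, the shortest linker at which an association is first detected. The sketch below assumes a hypothetical table that maps a linker length (in residues) to the pairs detected with that length. The lengths and pairs are placeholders, not values from this project.

```python
def shortest_detecting_linker(detections):
    """Map each unordered protein pair to the shortest linker length detecting it.

    detections: dict {linker_length: iterable of (protein_a, protein_b)}.
    """
    first_seen = {}
    for length in sorted(detections):
        for pair in detections[length]:
            # setdefault keeps the first (shortest) length recorded for a pair
            first_seen.setdefault(frozenset(pair), length)
    return first_seen


# Placeholder linker lengths and pairs, for illustration only
detections = {
    10: [("P1", "P2")],
    20: [("P1", "P2"), ("P2", "P3")],
    40: [("P3", "P2"), ("P1", "P4")],
}
resolution = shortest_detecting_linker(detections)
```

Pairs first detected with short linkers would then correspond to the fine-resolution view of the network, and pairs that appear only with the longest linkers to the coarse view.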
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not fully
known but that are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are examples. They are involved, respectively, in the degradation
and the storage of messenger RNA during cellular stress, and they are linked to various
diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of
resolution afforded by lengthening the linker would give us a general picture of their
architecture. At the scale of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
46
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
General conclusion
The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid
method means one that detects associations between proteins that are close together in space
without necessarily being in physical contact. Such a method would make it possible to dissect
the architecture of protein complexes in greater depth. In practice, this meant modifying the
length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, we first
had to verify whether increasing linker length changed the set of interactions detected. It was
also relevant to test the method on protein complexes using several combinations of linkers of
different lengths. Finally, the validation could be completed by comparing the results with
distances measured from the available protein structures of the proteasome.
The results of the first validation show that changing a single parameter, namely doubling the
length of one linker, significantly increased the signal-to-noise ratio and thereby improved
the identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variant of the DHFR PCA detects new interactions while preserving
the specificity of the method. Indeed, among all the unique interactions detected, more than
30% were new. One could therefore expect to obtain nearly as many new interactions if this
variant of the PCA were applied to other protein complexes that have already been studied.
This percentage could vary with the number of combinations of linkers of different lengths
used. For example, it could be lower if only a single combination were used, since some
protein-protein associations were detectable only with one specific combination of linkers.
Using an extended linker on the DHFR F[1,2] fragment alone appears to be sufficient to detect
most of the new PPIs and those whose signal increases.
The rare cases in which the signal decreased as linker length increased are more likely caused
by steric effects than by destabilization of the proteins involved. These cases can nonetheless
provide structural information, notably by identifying the strongest associations within the
complex. In addition, the use of extended linkers provides information on the organization of
protein complexes, particularly when it involves the central proteins. Finally, the
associations detected reflect well the organization of protein complexes into subcomplexes.
Comparing the distances between proteins in the proteasome structures with the PCA results
confirms that increasing linker length does allow the detection of associations between
proteins that are farther apart in space.
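As a rough illustration of the comparison between structural distances and PCA results, the
following Python sketch groups protein pairs by the linker condition in which they were
detected and compares the distances in each group. The pair names, distances and detection
calls are hypothetical placeholders, not data from this work, and the three-way grouping rule
is an assumption made for the example.

```python
from statistics import median

# Hypothetical records: (pair label, inter-protein distance in angstroms measured
# on a structure, set of linker conditions that gave a PCA signal).
# These values are invented for illustration only.
pairs = [
    ("A-B", 35.0, {"standard", "long"}),
    ("A-C", 60.0, {"standard", "long"}),
    ("B-D", 95.0, {"long"}),
    ("C-E", 120.0, {"long"}),
    ("D-E", 180.0, set()),
]


def distances_by_class(records):
    """Split distances into pairs seen with the standard linker,
    pairs seen only with the long linker, and pairs never detected."""
    groups = {"standard": [], "long_only": [], "undetected": []}
    for _label, distance, detected in records:
        if "standard" in detected:
            groups["standard"].append(distance)
        elif "long" in detected:
            groups["long_only"].append(distance)
        else:
            groups["undetected"].append(distance)
    return groups


groups = distances_by_class(pairs)
# Median distance per non-empty group; a larger median for "long_only" than for
# "standard" would be consistent with longer linkers reaching more distant pairs.
medians = {name: median(values) for name, values in groups.items() if values}

for name, value in medians.items():
    print(f"{name}: median distance {value:.1f} angstroms")
```

With real data, the distances would come from the available proteasome structures and the
detection calls from the scored PCA colonies; a statistical test on the two distributions
would be more informative than a comparison of medians.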
The modification made to the DHFR PCA represents a real advance in the study of
protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2]
fragment increases the ability to detect distant protein-protein associations. In future
experiments, it would be appropriate to use the standard linker in addition to the longer
linkers; this would provide a validation and a point of comparison, and would reveal problems
that may have arisen during construction of the fusion proteins. For example, a faulty
recombination event or the appearance of mutations is easier to spot: the correctly
constructed protein would show interactions, whereas the defective one would show none. This
control does, however, make the experiments and analyses more complex. Despite this drawback,
this variant of the DHFR PCA provides an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet can also be applied at large scale with a
robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous
promoter for protein expression. The fragments do not tend to associate spontaneously unless
they are brought very close together, which reduces false positives. The DHFR PCA can be
performed on either solid or liquid medium. It is therefore easy to study PPIs under several
growth conditions or in the presence of cellular perturbations. It can also be followed in
real time, which gives access to the dynamics of interactions (56). These features offer
certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths giving several resolutions of the PPI network. The first step would be
to determine the maximum length that still detects plausible protein-protein associations, so
as to limit false positives. The optimal increment would also need to be determined, to
maximize new information while accounting for the added complexity of each additional linker.
The availability of robotic platforms makes it more realistic to create collections of DHFR
F[1,2] fusion proteins with different linker lengths. Such additional collections would give a
picture of the yeast protein-protein association network at different resolutions, from fine
to coarse. Indeed, the longer the linker, the more distant the associations detected, which
lowers the molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as size and flexibility, should be taken into account. For small protein
complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas the
resolution should be coarser for large protein complexes.
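To make the trade-off between linker length and resolution more concrete, the sketch below
estimates a crude upper bound on the separation that two fully extended linkers could bridge,
and lists a candidate series of linker lengths. It is only a geometric approximation under
stated assumptions: the spacing of about 3.8 angstroms per residue is the approximate distance
between consecutive alpha carbons, the residue counts are hypothetical, and the size of the
DHFR fragments, the flexibility of real linkers and the position of the fusion site are all
ignored.

```python
# Crude geometric sketch; every numeric value here is an assumption for illustration.
CA_SPACING_ANGSTROM = 3.8  # approximate distance between consecutive C-alpha atoms


def max_linker_span(n_residues: int, spacing: float = CA_SPACING_ANGSTROM) -> float:
    """Upper bound on the end-to-end length of a fully extended linker."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * spacing


def max_separation(linker_a: int, linker_b: int) -> float:
    """Upper bound on the distance bridged by two linkers stretched toward each other.
    The reporter fragments themselves are ignored."""
    return max_linker_span(linker_a) + max_linker_span(linker_b)


def linker_series(shortest: int, increment: int, longest: int) -> list[int]:
    """Candidate linker lengths (in residues) from shortest to longest, inclusive."""
    if increment <= 0:
        raise ValueError("increment must be positive")
    return list(range(shortest, longest + 1, increment))


# Hypothetical design: one fragment keeps a 10-residue linker while the linker on
# the other fragment is lengthened in steps of 10 residues.
series = linker_series(10, 10, 40)
reach = {n: max_separation(10, n) for n in series}

for n, bound in reach.items():
    print(f"{n:>3} residues on the extended linker: at most {bound:.0f} angstroms")
```

Because a flexible linker is rarely fully extended, the effective reach is expected to be
shorter than these bounds; the maximum useful length and the optimal increment would still
have to be determined experimentally.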
The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not fully
known but which are visible by electron microscopy or other imaging methods. Their size
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are one example. They are involved, respectively, in the
degradation and storage of messenger RNA during cellular stress, and they are notably linked
to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale
of resolution afforded by lengthening the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
44
The rare cases in which the signal decreased as the linker was lengthened are more likely
caused by steric effects than by destabilization of the proteins involved. These cases can
nonetheless provide structural information, notably by identifying the strongest associations
within the complex. In addition, the use of extended linkers sheds light on the organization of
protein complexes, particularly when central proteins are involved. Finally, the detected
associations faithfully reflect the organization of protein complexes into subcomplexes.
Comparing the distances between proteins in the proteasome structures with the PCA results
confirms that lengthening the linker does allow the detection of associations between proteins
that are farther apart in space.
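
The comparison between structural distances and PCA results described above can be sketched as follows. This is only an illustration under stated assumptions: the pair labels, the distances, the detection calls, and the function `detection_rate_by_distance` are all hypothetical and are not the data or the analysis code used in this project.

```python
from collections import defaultdict

# Hypothetical records, for illustration only (not values from this project):
# (pair label, distance in angstroms between the tagged ends in a structure,
#  detected with the standard linker, detected with the long linker)
PAIRS = [
    ("A-B", 40.0, True, True),
    ("A-C", 75.0, False, True),
    ("B-C", 90.0, False, True),
    ("A-D", 160.0, False, False),
]
BIN_WIDTH = 50.0  # width of each distance bin, in angstroms (arbitrary choice)


def detection_rate_by_distance(pairs, bin_width):
    """Return {bin_start: (rate_standard, rate_long)} for each distance bin."""
    counts = defaultdict(lambda: [0, 0, 0])  # total, standard hits, long hits
    for _label, distance, hit_standard, hit_long in pairs:
        bin_start = int(distance // bin_width) * bin_width
        counts[bin_start][0] += 1
        counts[bin_start][1] += int(hit_standard)
        counts[bin_start][2] += int(hit_long)
    return {
        start: (hits_std / total, hits_long / total)
        for start, (total, hits_std, hits_long) in sorted(counts.items())
    }


rates = detection_rate_by_distance(PAIRS, BIN_WIDTH)
for start, (rate_std, rate_long) in rates.items():
    print(f"{start:.0f}-{start + BIN_WIDTH:.0f} A: "
          f"standard={rate_std:.2f}, long={rate_long:.2f}")
```

If the longer linker does extend the detection range, its detection rate should stay high in distance bins where the standard linker no longer gives a signal.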

The modification made to the DHFR PCA is a notable advance in the study of protein-protein
associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases
the capacity to detect distant protein-protein associations. In future experiments, it would be
appropriate to use the standard linker alongside the longer linkers. This would provide both a
validation and a point of comparison, and would help detect problems that may have arisen
during construction of the proteins, such as faulty recombination or the appearance of
mutations: interactions would be observed for the correctly constructed protein, whereas the
problematic construct would show none. This additional control admittedly makes the
experiments and analyses more complex. Despite this drawback, this variant of the DHFR PCA
provides an additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover,
the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression.
The fragments do not tend to associate spontaneously unless they are in very close proximity,
which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium,
so PPIs are easy to study under multiple growth conditions or in the presence of cellular
perturbations. It can also be monitored in real time, which gives access to the dynamics of
interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths that would give several resolutions of the PPI network. The first step
would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, so as to maximize the new information gained while taking into account the
complexity added by each additional linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2]-tagged proteins with different linker lengths.
Such additional collections would provide a picture of the yeast protein-protein association
network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more
distant the detected associations, which lowers the molecular resolution. Before investigating
a protein complex more exhaustively, its characteristics, such as its size and flexibility,
should be taken into account. For small protein complexes, a finer resolution, and therefore
shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large
protein complexes.
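
As a rough illustration of why longer linkers lower the molecular resolution, the sketch below computes an upper bound on the gap that two linkers could bridge. It assumes fully extended peptide chains at about 3.8 Å per residue (the usual spacing between consecutive C-alpha atoms), and it ignores the size of the reporter fragments themselves. The linker lengths of 10 to 40 residues and the function `max_bridged_distance` are hypothetical choices for the example, not values or code from this project. Real linkers are flexible, so the effective reach is expected to be shorter than this bound.

```python
# Approximate spacing between consecutive C-alpha atoms, in angstroms.
# Used here as an upper bound on the extension contributed by each residue.
ANGSTROM_PER_RESIDUE = 3.8


def max_bridged_distance(linker_a_residues, linker_b_residues):
    """Upper bound (angstroms) on the gap spanned by two fully extended linkers."""
    return (linker_a_residues + linker_b_residues) * ANGSTROM_PER_RESIDUE


# Hypothetical series: one fragment keeps a 10-residue linker while the
# linker on the other fragment is lengthened step by step.
FIXED_LINKER = 10
for variable_linker in (10, 20, 30, 40):
    reach = max_bridged_distance(FIXED_LINKER, variable_linker)
    print(f"{FIXED_LINKER} + {variable_linker} residues: up to {reach:.0f} A")
```

Each increment adds the same amount to the theoretical reach, which is one way to reason about the choice of increment when planning a range of linker lengths.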

The method developed during this master's project is particularly promising for the study of
macromolecular protein complexes. These are complexes whose composition is not fully known but
which are visible by electron microscopy or other imaging methods. The size of these complexes
greatly limits their study and makes determining their architecture a challenge. Processing
bodies and stress granules are examples. They are involved, respectively, in the degradation
and the storage of messenger RNA during cellular stress, and they have been linked to various
diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of
resolution afforded by lengthening the linker would give us a general picture of their
architecture. At the level of an organism's proteome, this method would provide a better
definition of the organization of the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
followed in real time, which gives access to the study of interaction dynamics (56). These
features offer certain advantages over other hybrid methods.
In this project, only two linker lengths were tested. It would be worthwhile to establish
a range of linker lengths that provides several resolutions of the PPI network. The first
step would be to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. The optimal increment would also need to be
determined, so as to maximize the new information gained while accounting for the
additional complexity introduced by each added linker. The availability of robotic
platforms makes it more realistic to create collections of DHFR F[1,2] proteins with
different linker lengths. Such additional collections would provide a picture of the yeast
protein-protein association network at different resolutions, from fine to coarse. Indeed,
the longer the linker, the more distant the associations detected, which lowers the
molecular resolution. Before investigating a protein complex more exhaustively, its
characteristics, such as its size and flexibility, should be taken into account. For small
protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient,
whereas the resolution should be coarser for large protein complexes.
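The idea of viewing one complex at several resolutions can be illustrated with a minimal sketch. It assumes hypothetical subunit coordinates and a hypothetical hard distance cutoff for each linker design; none of these values, nor the function name `detected_pairs`, come from this work, and a real linker would not impose a sharp cutoff.

```python
from itertools import combinations

# Hypothetical subunit coordinates (in angstroms) for a toy four-protein complex.
# Illustrative values only; they do not come from any measured structure.
positions = {
    "A": (0.0, 0.0),
    "B": (60.0, 0.0),
    "C": (60.0, 80.0),
    "D": (200.0, 0.0),
}

# Hypothetical maximum detectable separation (angstroms) for each linker design.
# Assumption: a longer linker tolerates a larger distance between tagged proteins.
detection_radius = {"short": 85.0, "long": 150.0}


def detected_pairs(coords, max_distance):
    """Return the set of protein pairs separated by at most max_distance."""
    pairs = set()
    for (p, (x1, y1)), (q, (x2, y2)) in combinations(sorted(coords.items()), 2):
        distance = ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
        if distance <= max_distance:
            pairs.add((p, q))
    return pairs


# One association network per linker length, i.e. per resolution.
networks = {name: detected_pairs(positions, r) for name, r in detection_radius.items()}
for name, pairs in networks.items():
    print(name, sorted(pairs))
```

In this toy case the long-linker network contains every pair found with the short linker plus more distant ones, which corresponds to the loss of molecular resolution described above.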
The method developed during this master's project becomes particularly attractive for
the study of macromolecular protein complexes. These are complexes whose composition is
not fully known but which are visible by electron microscopy or other imaging methods.
Their size greatly limits their study and makes determining their architecture a
challenge. Processing bodies and stress granules are one example. They are involved,
respectively, in the degradation and the storage of messenger RNA during cellular stress,
and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the
linker would give us a general picture of their architecture. At the level of an
organism's proteome, this method would provide a better definition of the organization of
the cellular machinery.
Bibliography
1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
47
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
48
43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709
49
64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9
50
84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709