letter

nature genetics • volume 24 • april 2000 363

Human LINE retrotransposons generate processed pseudogenes

Cécile Esnault, Joël Maestre & Thierry Heidmann

Unité des Rétrovirus Endogènes et Eléments Rétroïdes des Eucaryotes Supérieurs, CNRS UMR 1573, Institut Gustave Roussy, Villejuif Cedex, France. Correspondence should be addressed to T.H. (e-mail: [email protected]).

Long interspersed elements (LINEs) are endogenous mobile genetic elements (refs 1–4) that have dispersed and accumulated in the genomes of higher eukaryotes via germline transposition, with up to 100,000 copies in mammalian genomes. In humans, LINEs are the major source of insertional mutagenesis, being involved in both germinal and somatic mutant phenotypes (ref. 4). Here we show that the human LINE retrotransposons, which transpose through the reverse transcription of their own transcript (ref. 2), can also mobilize transcribed DNA not associated with a LINE sequence by a process involving the diversion of the LINE enzymatic machinery by the corresponding mRNA transcripts. This results in the ‘retroposition’ of the transcribed gene and the formation of new copies that disclose features characteristic of the widespread and naturally occurring processed pseudogenes: loss of intron and promoter, acquisition of a poly(A) 3´ end and presence of target-site duplications of varying length (refs 5,6). We further show, by introducing deletions within either coding sequence of the human LINE, that both ORFs are necessary for the formation of the processed pseudogenes, and that retroviral-like elements are not able to produce similar structures in the same assay. Our results strengthen the unique versatility of LINEs as genome modellers.

To assay for processed pseudogene formation, we made use of reporter genes marked by the neoRT cassette (refs 7,8), in which the neomycin gene is activated only after a cycle of transcription, reverse transcription and integration (that is, retroposition) of the marked gene (Fig. 1a). We introduced reporter genes into episomal vectors containing the phleomycin resistance gene. Expression vectors for the human LINE (L1; ref. 9) or retroviral elements were also constructed and inserted into episomal vectors with the hygromycin resistance gene. Complementation was performed upon co-transfection of mammalian cells (heterologous feline cells) with both vectors and selection with hygromycin and phleomycin. We then selected the resulting cell populations with G418 to assay pseudogene formation (Fig. 1b).

Fig. 1 Processed pseudogene formation and rationale of the assay. a, Schematic representation of a gene marked with the neoRT cassette, allowing detection of retroposition, and of the corresponding pseudogene, with splicing out of the intron, acquisition of a poly(A) tail (An), loss of promoter 5´ sequence and generation of a target-site duplication of variable length at the integration site (filled triangles). In the original copy of the gene, the neomycin gene (neo) is not expressed due to the presence of a polyadenylation sequence (pA) placed between the neo coding sequence and the promoter (Pr) for neo. After transcription and reverse transcription, the pA sequence inserted within an intron (splice donor and acceptor, SD, SA) is deleted (ref. 7). b, Experimental procedure for detection of processed pseudogene induction by LINE expression vectors. NeoRT-marked reporter genes are in episomal vectors with a phleomycin resistance gene and expression vectors for LINE in episomal vectors with a hygromycin resistance gene. They are introduced into mammalian cells by co-transfection and cell transformants are isolated upon hygromycin plus phleomycin selection; the resulting cell populations are then assayed for processed pseudogene formation upon selection of a large number of cells (at least 5 plates of 2×10^5 cells per plate) in a G418-containing medium.

© 2000 Nature America Inc. • http://genetics.nature.com

For the CMVneoRT reporter gene, we obtained G418R clones in the presence of an expression vector for the human LINE (Fig. 2a) at a rate at least tenfold higher than that in the control without the LINE expression vector. Assay of the DNA of such clones for the presence of retroposed copies with a spliced-out intron was carried out by PCR, Southern-blot analysis and sequencing after cloning (Fig. 2c–e). PCR amplification using primers bracketing the intronic domain of the reporter gene yielded a fragment of reduced size for all the clones tested (at least 20 clones tested per assay) in the LINE complementation assay (+CMVL1, Fig. 2c). In the same assay, none of the few clones obtained in the absence of LINE expression vector were positive (–CMVL1), indicating that these clones are probably associated with recombination or integration of initial copies close to cellular promoters in the course of cell transfection. This suggests that the frequency of occurrence of retroposed copies in the control should be at least tenfold lower than the apparent frequency of recovered G418R clones (that is, <10^-6), whereas it should be close to the apparent frequency in the LINE complementation assay (that is, 1–2×10^-4; Fig. 2b). Southern-blot analysis of a series of positive clones further provided evidence for retroposed copies with the intron spliced out: for six of seven clones (Fig. 2d), a band hybridizing with a probe within the reporter gene was detected at the position expected for a new copy of the reporter gene without the intron. For the clone in lane 7, we did not detect a band of the expected size for a retroposed copy, only a band of a higher molecular weight and a band corresponding to an unspliced copy of the reporter. This clone was found (by PCR analysis as in Fig. 2c) to contain a copy of the indicator gene with a spliced-out intron, suggesting that truncated cDNA copies of the reporter gene are also generated upon retroposition, resulting in bands of unexpected size; similar structures might also be responsible for the occurrence of the bands (Fig. 2c, arrowheads) that can be observed for some of the clones (Fig. 2c, lanes 1, 4). To further determine the structure of the retroposed genes, we cloned some of them using the GenomeWalker kit. This assay relies on restriction of genomic DNA of the G418R clones, followed by addition of linkers which can be used as primers for PCR. The 5´ and 3´ parts of retroposed genes are then amplified using primers in the indicator gene to generate fragments containing the region of the spliced-out intron (as a control) and primers in the added linkers. Among five clones analysed, the 3´ end of the retroposed copies was determined in all cases, and the 5´ end in three of five cases (Fig. 2e). In the latter cases, the integration target site was also identified in the DNA from untransfected cells by PCR using primers generated from the identified flanking sequences. We identified several features of the retroposed copies: (i) in all cases, the intron was found to be precisely spliced out from the retroposed copies; (ii) in three of five cases, the 3´ terminus of the retroposed copies was a poly(A) tail at the expected position for polyadenylation of the reporter gene transcript; and
(iii) in two cases, the 3´ terminus was truncated (15-bp and 42-bp deletions, 1 nt downstream and 20 nt upstream of the AATAAA polyadenylation signal, respectively), but analysis of the 5´ end of the retroposed copies demonstrated the presence of a target site duplication, which is the universal signature of a transposition event. A target site duplication was also observed for clone 54.1. These results show that retroposed copies of the reporter gene can be found which disclose structures closely related to those commonly detected in the genomes and classically referred to as processed pseudogenes: all of them lack introns (clones 72.6* and 73.5*, which correspond to a β-globin reporter gene (Fig. 4), also lack the β-globin intron 2 in addition to the neoRT intron); all those for which both ends could be determined disclose a target-site duplication of variable length; and truncations were observed at either end, but, when non-truncated, a poly(A) stretch (34 to 88 A) was present.

Fig. 2 LINEs induce processed pseudogene formation. a, Induction of G418R clones in the presence of an expression vector for LINE, using the reporter gene and expression vector shown on the left. CMVneoRT is a minimal reporter gene with a CMV promoter, a neoRT cassette and a SV40 polyadenylation sequence. In the control assay without LINE, the LINE expression vector was replaced throughout the experiments by an identical expression vector with lacZ in place of the LINE sequence. Results of the G418 selection are shown for plates seeded with 2×10^5 hygromycin+phleomycin-resistant cells, with the G418R foci fixed and stained. b, The number of foci per plate is indicated, as the mean of eight independent transfection assays with the standard error indicated (and the range in parentheses). The resulting retroposition frequency is shown, following a control PCR analysis of >20 clones per assay. c, DNAs from individual clones were assayed by PCR for the presence of retroposed copies with the neoRT intron spliced out, using primers bracketing the intronic domain. The first lane in the ethidium-bromide–stained agarose gel of the PCR products corresponds to DNA from the initial population of cells (P), before G418 selection; the following six lanes, to randomly selected clones from the LINE complementation assay; and the last five lanes, to some of the few clones recovered in the control assay without the LINE expression vector. Bands of the expected size for the spliced-out intron (387 bp) are observed only upon LINE complementation; the larger band of 1,019 bp observed in the initial population of cells before G418 selection, but not always in the resulting clones, probably corresponds to unspliced initial copies of the reporter gene. d, Southern-blot analysis of LINE-induced retroposed copies of the reporter gene. Southern-blot analysis results in 1,913-bp and 2,545-bp bands for spliced pseudogenes and unspliced initial copies, respectively (arrows), as well as bands of unexpected size (probably corresponding to truncated copies; arrowheads). e, Structure of retroposed genes and integration domains. Sequences of five retroposed genes are shown (asterisks are for retroposed copies from the β-globin-marked reporter gene), with the spliced out neoRT intron (and the spliced out β-globin intron 2) schematized with triangles, the poly(A) stretch indicated when present, and the deletions (either 5´ or 3´, none of which are identical among the various clones) indicated in brackets. The target genomic DNA before integration and the target site duplications are shown as enlarged filled boxes with the length indicated and the sequences as follows (target site duplications in capital): ggcgcAGAAAGTGTCCACAGTGgagaa, taggtATGCGTTTATTTTCACGTGTAACAGGAAAAACATAACTAGCACATCACCATGTGACTGCGGgatat and actatTCTTggatt, for clones 54.1, 72.6 and 73.5, respectively. N.D., not determined.

Fig. 2b panel labels: clones per plate ± s.e. (ranges shown: 0–17, 7–102); retroposition frequency ± s.e.
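The target-site duplications quoted in the Fig. 2e legend use a simple case convention (capitals for the duplicated bases, lowercase for the surrounding genomic flank), so their lengths can be read off mechanically. The following is a minimal sketch of that bookkeeping only; the function names `extract_tsd` and `tsd_lengths` are hypothetical and are not part of the original analysis.

```python
import re

# Sequences as printed in the Fig. 2e legend. Lowercase bases are genomic
# flanks; the uppercase run is the target-site duplication (TSD).
TARGET_SITES = {
    "54.1": "ggcgcAGAAAGTGTCCACAGTGgagaa",
    "72.6": ("taggtATGCGTTTATTTTCACGTGTAACAGGAAAAACATAACTAGCACATCACC"
             "ATGTGACTGCGGgatat"),
    "73.5": "actatTCTTggatt",
}


def extract_tsd(site: str) -> str:
    """Return the single uppercase run (the TSD) of a mixed-case site string."""
    runs = re.findall(r"[ACGT]+", site)  # case-sensitive: capitals only
    if len(runs) != 1:
        raise ValueError(f"expected exactly one uppercase run, found {len(runs)}")
    return runs[0]


def tsd_lengths(sites: dict) -> dict:
    """Map each clone name to the length (in bp) of its TSD."""
    return {clone: len(extract_tsd(seq)) for clone, seq in sites.items()}


if __name__ == "__main__":
    for clone, length in tsd_lengths(TARGET_SITES).items():
        print(f"clone {clone}: TSD of {length} bp")
```

Run on the three sequences above, the lengths differ widely between clones, which is the "variable length" feature the text associates with processed pseudogenes.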

To obtain further insight into the role of LINE in retroposition, we made specific deletions within the LINE expression vector (Fig. 3). These comprise a 2,137-bp deletion within ORF2 encompassing the LINE reverse transcriptase (RT) and an 872-bp deletion within ORF1, which was made in-phase so as not to perturb ORF2 initiation of translation within the bicistronic RNA. We ensured, by a PCR-based assay for LINE RT activity (ref. 10), that the LINE ORF2 protein was still produced in the ORF1 mutant and not in the ORF2 mutant, as a control (Fig. 3b). Under these conditions, we found (Fig. 3c) that both LINE ORFs are required for the generation of processed pseudogenes. Finally, as it had been suggested that retroviral elements might be involved in processed pseudogene formation, we also assayed an expression vector for the Gag-Pol proteins of the Moloney murine leukaemia retrovirus (MoMLV; ref. 10), but saw no effect.

Fig. 3 The two LINE ORFs are necessary for processed pseudogene formation. a, Assay as in Fig. 2a for processed pseudogene formation with LINE expression vectors rendered defective for ORF1 or ORF2, and with an expression vector for the MoMLV gag-pol retroviral proteins. LINE ORF2 was rendered defective by deletion of a 2,137-bp internal fragment encompassing the RT domain. LINE ORF1 was rendered defective by an 872-bp deletion, which was in-phase so as to allow ORF2 translation. LINE ORF2 was assayed in these constructs using a PCR-based LINE-specific RT-assay performed under conditions of linear response as described (ref. 10), with the resulting PCR fragments on an ethidium-bromide–stained gel shown in (b); the assay is not applicable (NA) to a retroviral (MoMLV) RT. c, Quantitation for retrotransposition of the CMVneoRT reporter gene was as in Fig. 2.

A final issue was to determine whether the LINE-mediated retroposition activity might discriminate between different transcripts; that is, whether there is a preference of the LINE machinery for LINE transcripts. To test this, we constructed a reporter gene corresponding to a complete LINE with stop codons in both ORFs to prevent protein synthesis in cis, and performed a complementation assay with a LINE expression vector as above. We detected no difference in the retroposition frequency between the LINE-defective reporter gene and the minimal CMVneoRT reporter gene (Fig. 4). Similarly, we saw no difference with a reporter gene derived from the β-globin gene with its own polyadenylation sequence, thus excluding any specific effect of the SV40 polyadenylation sequence used in the two former reporter genes. Thus, there is no sequence preference for mRNA retroposition, and a LINE RNA is not a preferred sequence for the LINE retroposition machinery. Yet this statement is incomplete: when the LINE ORFs are carried by the mRNA to be retroposed itself (Fig. 4), namely when following the direct retroposition of a marked and fully coding LINE, retroposition frequency is approximately 20-fold higher than under conditions of complementation in trans, even though the mRNA substrates to be retroposed in both cases are nearly identical (LINE neoRT without or with stop codons in the ORFs). This demonstrates that there exists an uncommon cis effect in retroposition which does not involve the sequence per se of the RNA to be retroposed, but rather its capacity to encode the proteins precisely required for retroposition.

Fig. 4 Absence of sequence specificity for mRNA retroposition, but evidence for strong cis effects. Assays for processed pseudogene formation were carried out with neoRT-containing reporter genes, either the minimal CMVneoRT reporter gene, the β-globin neoRT reporter gene or a reporter gene containing the LINE element rendered defective by introduction of stop codons (asterisks) within its two ORFs, and complementation in trans by the LINE expression vector. Assay for self-retroposition of the transposition-competent neoRT-marked LINE was carried out in the absence of the LINE expression vector, and discloses very high efficiency compared with the assay for retroposition of the same RNA under conditions of complementation in trans. Quantitation was as in Fig. 2.

Figs 3 and 4 panel labels: clones per plate ± s.e.; retroposition frequency ± s.e. (ranges shown: 7–59, 1–14, 0–13, 9–109, 25–74, 5–60, 160–917, 0–4, 0–17).
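The frequencies compared here rest on simple arithmetic: G418R foci counted per plate, divided by the number of cells seeded, and corrected by the fraction of clones confirmed by PCR to carry a spliced copy. The sketch below illustrates that calculation under stated assumptions. The plating density of 2×10^5 cells per plate is taken from the Fig. 2 legend; the exact formula, the function names and the foci counts in the usage lines are hypothetical and chosen only to show the orders of magnitude discussed in the text.

```python
CELLS_PER_PLATE = 2e5  # plating density given in the Fig. 2 legend


def retroposition_frequency(foci_per_plate: float,
                            spliced_fraction: float = 1.0,
                            cells_per_plate: float = CELLS_PER_PLATE) -> float:
    """Estimate retroposition events per plated cell.

    Assumed formula (not quoted from the paper): foci per plate, scaled by
    the fraction of clones that are PCR-positive for a spliced-out intron,
    divided by the number of cells seeded per plate.
    """
    if not 0.0 <= spliced_fraction <= 1.0:
        raise ValueError("spliced_fraction must lie between 0 and 1")
    if cells_per_plate <= 0:
        raise ValueError("cells_per_plate must be positive")
    return foci_per_plate * spliced_fraction / cells_per_plate


def fold_difference(freq_a: float, freq_b: float) -> float:
    """Return how many times larger freq_a is than freq_b."""
    if freq_b <= 0:
        raise ValueError("freq_b must be positive")
    return freq_a / freq_b


if __name__ == "__main__":
    # Hypothetical mean foci counts, for illustration only.
    trans = retroposition_frequency(30)   # complementation in trans
    cis = retroposition_frequency(600)    # self-retroposition of a coding LINE
    print(f"trans: {trans:.1e}, cis: {cis:.1e}, "
          f"ratio: {fold_difference(cis, trans):.0f}-fold")
```

With these illustrative inputs the trans frequency falls within the 1–2×10^-4 range reported for the LINE complementation assay, and the cis-to-trans ratio matches the approximately 20-fold difference described above.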

LINEs are therefore extremely versatile genome modellers (Fig. 5). First, they can transpose, via a high-efficiency process involving the LINE ORF products and the transcript encoding them (cis effect). Second, they can retropose, with a reduced (but still detectable) efficiency, transcribed DNAs (trans effect). The cis effect (refs 2,4) is responsible for a large number of insertional mutations in humans, with an estimated rate of 1 germline insertion in every 100 individuals (ref. 11). It has also been shown that the cis effect might be responsible for the transfer of genomic sequences located 3´ to a functional LINE (ref. 12). This phenomenon, referred to as ‘exon shuffling’, relies on the poor efficiency of the LINE polyadenylation sequence, which results in read-through LINE transcripts and the de facto transfer of the transcribed 3´ genomic sequence to a new location upon LINE retrotransposition (refs 12,13). Here we have demonstrated that LINE can also act in trans, and therefore mobilize transcribed sequences not necessarily associated with a LINE element. This results in the generation of retroposed copies disclosing features characteristic of the naturally found processed pseudogenes. Such processed pseudogenes are common genomic structures, some of which are even expressed and fulfil new physiological functions (refs 5,6,14,15). It is worth mentioning that 3´-end truncations, as observed for two retroposed copies in the present assay, are not frequent among naturally occurring pseudogenes. They probably result from deletions or rearrangements of limited extent, whose frequency may be exacerbated by a genetic instability of the cells used for the assay (cell lines in culture with enhanced level of retroposition activity versus germline cells in vivo for the vertically transmitted pseudogenes). We further showed that there is no RNA sequence specificity for the LINE-mediated trans effect. This suggests that LINEs might also be responsible for the mobilization of the SINE (refs 3,6) retrotransposons (among which are the human Alu sequences), which are noncoding and therefore require complementation in trans for their retroposition. Concerning the relative efficiency of the cis versus trans LINE effects, we have shown that retroposition frequency is higher when the proteins necessary for retroposition (LINE ORF1 and ORF2) are encoded by the same transcription unit as the one which carries the indicator gene for retroposition. This is most probably due to the fact that there is a direct recognition of the LINE mRNA in the course of its translation as a target for retroposition (refs 16,17). It would also account for the observed preferential retroposition of RNA from coding, rather than mutated, LINEs, referred to as cis preference (ref. 4).

Although the ongoing generation of retroposed copies has been demonstrated, by both systematic studies using indicator genes (refs 8,18) and incidental evidence (refs 19–22), direct involvement of LINEs in this process has been debated. Yet there are other RT-expressing elements in eukaryotic genomes, among which are the LTR retrotransposons that closely resemble retroviruses (refs 3,7,23). Previous studies have shown that they were not likely to be responsible for processed pseudogene formation. In fact, attempts to generate such structures using retroviruses or retroviral-like elements failed to demonstrate canonical pseudogenes (refs 24–26), but generated cDNA copies disclosing either deletions and absence of target-site duplications, thus resembling ‘transfected cDNAs’, or systematic association with retroviral sequences (for review, see refs 8,10,18). Furthermore, we had previously shown that the LINE-encoded RT had specific enzymatic properties not shared by that of retroviruses (including MoMLV and HIV), allowing the reverse transcription of cellular mRNA with high efficiency and resulting in non-integrated cDNA copies (ref. 10). Accordingly, we have shown here that an expression vector for retroviral gag-pol genes also fails to make processed pseudogenes, again underscoring a unique property of LINEs. This specificity is further emphasized by the absolute requirement of LINE ORF1 for pseudogene formation: although the precise role of ORF1 is still unknown, it might be involved in the formation of ‘particles’ (refs 4,27,28) allowing the re-entry of the mRNA to be retroposed into the nucleus and integration of the reverse transcripts. A final hint for a role of LINEs in processed pseudogene formation arises from the systematic sequencing of the Saccharomyces cerevisiae genome, which provided evidence for the presence of numerous RT-expressing elements (including functional telomerases and several active LTR-retrotransposons), but the correlated absence of LINEs, processed pseudogenes and Alu-like sequences (refs 29,30). Our data demonstrate that LINEs are prone to generate processed pseudogenes, and support the direct involvement of these elements in the endogenous ‘retroposition’ activity found in mammalian cells (refs 8,18) and in the generation of diversity through both insertional mutagenesis and genome modelling.

Fig. 5 Cis and trans effects of LINEs. Scheme for the effects of LINEs, with complementation in trans resulting in processed pseudogene formation (and retrotransposition of non-coding LINEs and possibly SINEs (Alu)), and complementation in cis resulting in LINE transposition (and in exon shuffling (ref. 12) in the case of 3´ read-through in the course of LINE transcription). The size of the arrows is meant to illustrate the relative efficiency of each process (1:20 ratio).

Methods

Plasmids. We generated expression vectors as follows. We constructed pCMV-L1 from the cloned LINE L1.2A (a gift from H. Kazazian; ref. 9) as described (ref. 10). We constructed p220.CMV-L1 by inserting a SalI-SphI blunt-ended CMV-L1 fragment into the episomal p220.2 (HindIII, blunt-ended) vector (Clontech). An in-phase deletion in ORF1 was generated upon restriction of pCMV-L1 with BglII and XhoI and religation after blunt-ending, resulting in pCMV-L1∆1. A deletion in ORF2 was generated upon restriction of pCMV-L1 with EcoRV (deletion of the RT domain) and religation, resulting in pCMV-L1∆2. We constructed p220.CMV-L1∆1 and p220.CMV-L1∆2 by replacing CMV-L1 within p220.CMV-L1 by CMV-L1∆1 and CMV-L1∆2, upon restriction with SnaBI and SalI. We constructed p220.CMV-MoMLV∆env and p220.CMVβ (control expression vector) by inserting an EcoRI-AseI blunt-ended fragment from pCMV-MoMLV∆env (ref. 10) and a PstI blunt-ended fragment from CMVβ (Stratagene), respectively, into p220.2 (HindIII, blunt-ended).

We generated intron-marked reporter genes as follows. We constructed pCMVneoRT using the neoRT indicator gene (ref. 7) as described (ref. 8). We constructed p220.SVphleo by inserting a BamHI-SpeI blunt-ended fragment from pUT 531 (Cayla) into p220.2 (NruI and HindIII, blunt-ended). p220.CMVneoRT was constructed by inserting CMVneoRT (PvuI-AseI, blunt-ended) into p220.SVphleo (XbaI, blunt-ended). We constructed pCMVβ-glo neoRT by inserting neoRT (SalI fragment, blunt-ended) into exon 3 (BstXI, blunt-ended) of the rabbit β-globin gene in pCMVβ-globin (gift from F. Dautry). p220.CMVβ-glo neoRT was constructed by replacing lacZ in p220.CMVβ (XhoI and SalI blunt-ended) by a HindIII-XmnI blunt-ended fragment from pCMVβ-glo neoRT. We constructed pCMV-L1 neoRT by inserting neoRT (AccI fragment, blunt-ended) into the blunt-ended DraIII site of pCMV-L1. p220.CMV-L1 neoRT was constructed by inserting CMV-L1 neoRT into p220.CMV-L1, both restricted by SnaBI and SalI. pCMV-ORF2* was constructed by inserting a 20-mer linker with stop codons in all six phases into the blunt-ended unique BspMI site in pCMV-ORF2. We constructed pCMV-L1*neoRT by replacing ORF2 in pCMV-L1 neoRT by ORF2*, both restricted by PmlI and EcoNI. p220.CMV-L1* neoRT was constructed by replacing CMV-L1 in p220.CMV-L1 by CMV-L1*neoRT, both restricted by SnaBI and SalI. We generated p220.CMV-L1**neoRT from p220.CMV-L1* neoRT upon introduction of a frameshift in ORF1 resulting from the in-place religation of an ORF1 XhoI-XhoI fragment, after Klenow treatment.

Cells, transfection, nucleic-acid purification and analysis. We grew G355.5 feline cells and subclones in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 7.5% fetal calf serum (Gibco), streptomycin (100 µg/ml) and penicillin (100 U/ml). For selection of transformants, 10^6 cells per 60-mm dish were transfected with lipofectamine (20 µl; Gibco-BRL), neoRT-marked gene (2 µg) and expression vector (2 µg). We selected transformants with hygromycin (150 U/ml) and phleomycin (7.5 µg/ml). G418 selections were in geneticin (560 µg/ml; Gibco-BRL). We extracted cellular DNA as described (ref. 7). DNA (10 µg) was restricted, electrophoresed on agarose gels and transferred to Hybond-N+ membranes (Amersham) for Southern-blot analysis as described (ref. 8). PCR amplification of the neoRT intronic domain was performed in 50 µl (containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 3 mM MgCl2, 0.2 mM each dNTP, 1 µM each primer, 1 U Taq polymerase (Amersham) and 1 µg cellular DNA). After an initial step at 94 °C (2 min 50 s), 30 cycles of amplification (90 s at 65 °C, 95 s at 72 °C, 75 s at 94 °C) were carried out with primers n and t (ref. 8).
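The cycling conditions for the neoRT intron PCR can be written down as structured data, which makes the programme easy to check and to total. The following is an illustrative sketch only: the `Step` class and `total_seconds` function are hypothetical names, the initial step is assumed to mean 2 min 50 s, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One incubation step of a thermocycler programme."""
    temp_c: float
    seconds: int


# Values transcribed from the Methods text above.
INITIAL = Step(temp_c=94, seconds=2 * 60 + 50)  # assumed to be 2 min 50 s
CYCLE = (
    Step(temp_c=65, seconds=90),  # annealing
    Step(temp_c=72, seconds=95),  # extension
    Step(temp_c=94, seconds=75),  # denaturation
)
N_CYCLES = 30


def total_seconds(initial: Step, cycle: tuple, n_cycles: int) -> int:
    """Total programmed hold time in seconds, excluding temperature ramps."""
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


if __name__ == "__main__":
    seconds = total_seconds(INITIAL, CYCLE, N_CYCLES)
    print(f"programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
```

Each cycle holds for 260 s, so the 30 cycles plus the initial step amount to a little over two hours of programmed time before ramping is taken into account.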

Isolation of retroposed copies and flanking sequences. We used the Universal GenomeWalker Kit (Clontech) to construct libraries, according to the manufacturer’s instructions, from a series of G418R clones (after restriction of genomic DNAs with EcoRV). Sequences flanking the retroposed genes were amplified by nested PCR using primers within neoRT (n, n3, t, t2; ref. 8) and GenomeWalker-specific primers. PCRs were performed as indicated by the manufacturer in 50 µl (containing 1×Buffer II, 1.1 mM Mg(OAc)2, 200 µM dNTPs, 0.4 µM GenomeWalker and neoRT primers, 1 µl rTth polymerase mix (Perkin Elmer) and 1 µl GenomeWalker Library DNA). Amplification cycle conditions included an initial step at 94 °C for 30 s, followed by 7 cycles (2 s at 94 °C, 3 min at 72 °C) and 32 cycles (2 s at 94 °C, 3 min at 67 °C) of amplification. Nested PCR amplifications were with primary PCR sample (1 µl). PCR products were electrophoresed on agarose gels, eluted (Qiaquick, Qiagen), cloned in the pGEM-T easy vector (Promega) and sequenced using BigDye Terminators and AmpliTaq FS (Perkin Elmer ABI). Target DNAs were PCR amplified using the same conditions as above with genomic DNA from untransfected cells and primers designed according to the newly identified flanking sequences.

Acknowledgements

We thank H. Kazazian for the LINE pL1.2A plasmid; F. Dautry for the CMV-β-globin plasmid; O. Dhellin for constant help and advice; L. Bénit for help with computer searches; and C. Lavialle for comments and critical reading of the manuscript. This work was supported by the CNRS and a grant from the ARC to T.H.

Received 1 December 1999; accepted 22 February 2000.

1. Jensen, S. & Heidmann, T. An indicator gene for detection of germline retrotransposition in transgenic Drosophila demonstrates RNA-mediated transposition of the LINE I element. EMBO J. 10, 1927–1937 (1991).

2. Moran, J.V. et al. High frequency retroposition in cultured mammalian cells. Cell 87, 917–927 (1996).

3. Boeke, J.D. & Stoye, J.P. Retrotransposons, endogenous retroviruses, and the evolution of retroelements. in Retroviruses (eds Coffin, J.M., Hughes, S.H. & Varmus, H.E.) 343–435 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1997).

4. Kazazian, H.H.J. & Moran, J.V. The impact of L1 retrotransposons on the human genome. Nature Genet. 19, 19–24 (1998).

5. Vanin, E.F. Processed pseudogenes: characteristics and evolution. Annu. Rev. Genet. 19, 253–272 (1985).

6. Weiner, A.M., Deininger, P.L. & Efstratiadis, A. Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Annu. Rev. Biochem. 55, 631–661 (1986).

7. Heidmann, O. & Heidmann, T. Retrotransposition of a mouse IAP sequence tagged with an indicator gene. Cell 64, 159–170 (1991).

8. Maestre, J., Tchénio, T., Dhellin, O. & Heidmann, T. mRNA retroposition in human cells: processed pseudogene formation. EMBO J. 14, 6333–6338 (1995).

9. Dombroski, B.A., Mathias, S.L., Nanthakumar, E., Scott, A.F. & Kazazian, H.H. Isolation of an active human transposable element. Science 254, 1805–1808 (1991).

10. Dhellin, O., Maestre, J. & Heidmann, T. Functional differences between the human LINE retrotransposon and retroviral reverse transcriptases for in vivo mRNA reverse transcription. EMBO J. 16, 6590–6602 (1997).

11. Kazazian, H.H.J. An estimated frequency of endogenous insertional mutations in human. Nature Genet. 22, 130 (1999).

12. Moran, J.V., DeBerardinis, R.J. & Kazazian, H.H. Jr Exon shuffling by L1 retrotransposition. Science 283, 1530–1534 (1999).

13. Eickbush, T. Exon shuffling in retrospect. Science 283, 1465–1467 (1999).

14. Brosius, J. Retroposons: seeds of evolution. Science 251, 753 (1991).

15. Lahn, B.T. & Page, D.C. Retroposition of autosomal mRNA yielded testis-specific gene family on human Y chromosome. Nature Genet. 21, 429–433 (1999).

16. Finnegan, D.J. The I factor and I-R hybrid dysgenesis in Drosophila melanogaster. in Mobile DNA (eds Berg, D.E. & Howe, M.M.) 503–517 (American Society for Microbiology, Washington, DC, 1989).

17. Boeke, J.D. LINEs and Alu: the polyA connection. Nature Genet. 16, 6–7 (1997).

18. Tchénio, T., Ségal-Bendirdjian, E. & Heidmann, T. Generation of processed pseudogenes in murine cells. EMBO J. 12, 1487–1497 (1993).

19. Klenerman, P., Hengartner, H. & Zinkernagel, R.M. A non-retroviral RNA virus persists in DNA form. Nature 390, 298–301 (1997).

20. Weiss, R.A. & Kellam, P. Illicit viral DNA. Nature 390, 235–236 (1997).

21. Gabriel, A. & Teng, S.-C. LCMV cDNA formation: which reverse transcriptase is responsible? Trends Genet. 14, 220–221 (1998).

22. Carlton, M.B., Colledge, W.H. & Evans, M.J. Generation of a pseudogene during retroviral infection. Mamm. Genome 6, 90–95 (1995).

23. Boeke, J.D., Garfinkel, D.J., Styles, C.A. & Fink, G.R. Ty elements transpose through an RNA intermediate. Cell 40, 491–500 (1985).

24. Dornburg, R. & Temin, H.M. cDNA genes formed after infection with retroviral vector particles lack the hallmarks of natural processed pseudogenes. Mol. Cell. Biol. 10, 68–74 (1990).

25. Levine, K.L. et al. Unusual features of integrated cDNAs generated by infection with genome-free retroviruses. Mol. Cell. Biol. 10, 1891–1900 (1990).

26. Derr, L.K., Strathern, J.N. & Garfinkel, D.J. RNA-mediated recombination in S. cerevisiae. Cell 67, 355–364 (1991).

27. Martin, S.L. Ribonucleoprotein particles with LINE-1 RNA in mouse embryonal carcinoma cells. Mol. Cell. Biol. 11, 4804–4807 (1991).

28. Hohjoh, H. & Singer, M. Cytoplasmic ribonucleoprotein complexes containing human LINE-1 protein and RNA. EMBO J. 15, 630–639 (1996).

29. Sandmeyer, S. Targeting transposition: at home in the genome. Genome Res. 8, 416–418 (1998).

30. Kim, J.M., Vanguri, S., Boeke, J.D., Gabriel, A. & Voytas, D.F. Transposable element and genome organisation: a comprehensive survey of retrotransposons revealed by the complete Saccharomyces cerevisiae genome sequence. Genome Res. 8, 464–478 (1998).

© 2000 Nature America Inc. • http://genetics.nature.com
