A Small Trip in the Untranquil World of Genomes: A survey on the detection and analysis of genome rearrangement breakpoints

"Then let us clear away the choaking thorns / From round its gentle stem; let the young fawns, / Yeaned in after times, when we are flown, / Find a fresh sward beneath it, overgrown / With simple flowers: let there nothing be / More boisterous than a lover's bended knee; / Nought more ungentle than the placid look / Of one who leans upon a closed book; / Nought more untranquil than the grassy slopes / Between two hills. All hail delightful hopes!" (from The Poetical Works of John Keats)

Claire Lemaitre (1,2,*) and Marie-France Sagot (1,2)

January 1, 2007

1 Equipe BAOBAB, Laboratoire de Biometrie et Biologie Evolutive (UMR 5558); CNRS; Univ. Lyon 1, 43 bd du 11 nov 1918, 69622, Villeurbanne Cedex, France.
2 Projet Helix, INRIA Rhône-Alpes, 655 avenue de l'Europe, 38330 Montbonnot Saint-Martin, France
* Corresponding author (clemaitr@biomserv.univ-lyon1.fr)

Abstract

Genomes are dynamic molecules that are constantly undergoing mutations and rearrangements. The latter are large scale changes in a genome organisation that participate in the evolutionary and speciation process, but may also be involved in inherited diseases and in cancer. They have long been studied by biologists, whereas computational biologists have been attracted to the topic only more recently. One of the (exciting) objectives for studying rearrangements is to understand the underlying molecular mechanisms of evolution. One possible line of investigation is to analyse, at the sequence level, the regions which have undergone a rearrangement, assuming we are able to locate them very precisely. This paper presents a survey of the different methods that have been developed to identify such regions, in particular the approaches that are based on the alignment of genomes. The main purpose of the paper is then to investigate what is currently known about the characteristics of the regions where a rearrangement took place, and about the mechanism(s) having led to such large scale changes.

Keywords: Genome dynamics, rearrangement, breakpoint region, whole genome alignment, conserved segment, synteny block

1 Introduction

From a relatively marginal topic when others, like sequence alignment for instance, were in full bloom in the early years of computational biology, genome dynamics has evolved into an increasingly active area of research. The area has also grown in sophistication, although the models used remain in general biologically unrealistic. Indeed, this is an area where the gap between what has been done by the computational biologists, and what has long been known or is believed to be true by the biologists, is perhaps the greatest in the field. Algorithmicists in particular, and among them those coming from a combinatorial background, have loved the problem in its initial formulations because of its close relation to concepts long familiar to them, such as permutations, and because of the simplicity of the questions one can ask. This was thus a topic where it seemed possible to make important contributions without having to get too deeply into the underlying biological complexity. This is, of course, not true, at least not any more as soon as one starts wanting to "interpret" the results obtained or to use them to further our knowledge on, for instance, evolution.

This issue is one of the causes of some very recent polemics concerning genome dynamics. Part of the polemics (for instance, surrounding the existence or not of "hotspots", that is, regions along a genome that are more susceptible to be the loci of rearrangements [?][?][?][?][?]) has involved different groups of computational biologists. Others, such as those exemplified in the March issue of Genome Research [?][?], have involved computational biologists with biologists working with data (coming from cytogenetics^1) that pre-existed the sequencing of whole genomes.

^1 Cytogenetics is the branch of biology that deals with heredity and the cellular components, particularly chromosomes, associated with heredity.

Our purpose with this survey is not to participate ourselves in those polemics, nor to explore all the aspects behind genome dynamics. Indeed, a whole book would not be enough for this. We shall concentrate instead on two questions. The first is detecting the breakpoints, that is, the exact points along a genome where a rearrangement has taken place, in the organism under study or in the homologous genome locus of another organism. The second question concerns the analysis of the regions around breakpoints.

In fact, the first question, detecting the breakpoints, which seems very simple, is a hard one and, to the best of our knowledge, has never been addressed in this very precise way. Many approximations have been made, in the sense that all methods that detect genome segments conserved among different organisms are trying to delimit a more or less wide region around possible breakpoints. Part of this paper will be a survey of such methods, and of methods developed for another purpose but that could be used to identify conserved segments and thus their duals, that is, regions that at some point were "broken". It is ironic, and satisfying, that this will take us back to the very beginnings of computational biology: sequence alignments! The scales, though, are not at all the same any more.

Why the interest in precisely detecting breakpoints? One main motivation does lead us back to some of the polemics we alluded to above: finely analysing the regions around breakpoints could give us some clues on the issue of hotspots. Beyond that, it could help us both to reach a better understanding of the possible mechanisms behind rearrangements, and to identify which have indeed happened. This in turn could help improve the models for comparing genomes, deriving possible ancestors and, ultimately, understanding the course of evolution and its functional impact. The fine analysis of the regions around breakpoints, with the aim of better understanding the underlying mechanisms that have led to the breaks, will thus be the second main concern of this survey.

To simplify matters, we shall concentrate our attention on the evolution of mammals only. The genomes of other species are as mobile but probably present a different dynamic.

Much is already at least partially known about the rearrangements that are possible and about their underlying mechanisms, both from a "pure biology" point of view, and through some initial computational studies that were done in the past, at a small or larger scale, and are appearing with an increasing frequency. Much more is not known. Finding one's way in the written or oral literature to piece all the information together, or just to precisely identify what is and what is not known, is, however, like trying to find a set of needles dispersed inside thousands of haystacks. This paper therefore aims only to serve as an initial kick to the investigation.
We hope that it helps at least to show that the issue is even more complicated than already thought, and far more fascinating.

The paper is organised as follows. We start by giving a general introduction to genome dynamics, including a quick presentation of some of the experimental techniques used to identify and study rearrangements. We then explore and discuss the methods developed with the purpose of detecting breakpoints (or their close cousins, conserved segments), and those methods that could be hijacked to do that. We then get to the heart of this paper, which is a survey of what is known, through genomic approaches, about possible rearrangement mechanisms. We end with a general discussion and some open questions.

2 Biological background

The expression "genome dynamics" refers to the structural variations observed in genomes along the course of evolution. Besides point mutations, genomes thus undergo large scale changes that have been called rearrangements. These involve parts of the genome that may be of varying length, from several kilobases to entire chromosomes. Several types of rearrangements are also to be distinguished: inversion, duplication and deletion of a segment inside a chromosome; transposition; reciprocal translocation, which is the exchange of two segments between two chromosomes; fission, which is the breakage of one chromosome in two; and fusion of two chromosomes, that is, their joining into one.

Such rearrangements play an important role in evolution and speciation, although it has been observed [?] that not all rearrangements create a species barrier as was originally believed. Indeed, genomic structural polymorphisms have been observed in individuals within a same population [?]. Rearrangements have however often been associated with genomic disorders [?] and have therefore been well studied by biologists for a long time, with the aim in particular of understanding their underlying molecular processes.

It is largely accepted that most rearrangements are initiated by one or several Double Strand Break(s) (henceforth denoted by DSB). A DSB is a break that cuts the two strands of a DNA molecule at the same position, as opposed to a Single Strand Break. Such lesions are not rare and may be induced by various factors, for instance, by Reactive Oxygen Species (oxygen ions, free radicals and peroxides), ionizing radiation (X and gamma rays), replication across a nick, and so on. In some specific cases, DSBs may also happen in a deliberate and programmed manner, taking part in a more complex molecular mechanism: for instance, DSBs may be generated by specific enzymes during V(D)J recombination in lymphocytes, or in the crossing-over process during meiosis.

V(D)J recombination is a process that participates in immune cell protection. It generates variability in the immune response molecules, which is essential for the cell because it enables the recognition of a great number of "foreign" entities in the organism. Starting from an original set of DNA segments, V(D)J recombination generates different combinations among them thanks to site-specific DSBs. Recombination in general, not just V(D)J, is a complex and well documented mechanism in molecular biology. It allows the exchange of DNA segments between two DNA molecules (or two parts of a single molecule). The process is initiated by nucleotide pairing between the two molecules, thus a stretch of sequence similarity is needed. Then the two molecules are intertwined, and this leads to a complex conformation called the Holliday junction, which can be resolved by the exchange of segments between the two molecules. As a simplified example, a recombination between two molecules AB and A'C at a locus A can lead to the molecules AC and A'B.

As concerns meiosis (the gamete generation step), DSBs may be involved in it through genetic recombination (or crossing over). The latter plays a crucial role in the generation and maintenance of genetic diversity by shuffling alleles between homologous chromosomes.

In the above cases, DSBs appear useful for the cell, but they are always a source of serious damage if they are not repaired because the genomic integrity of the cell is endangered. Indeed, a single DSB may be sufficient to stop the cell cycle. Contrary to Single Strand Breaks, which can easily be repaired using the unbroken strand as template, the repair of a DSB requires more complex molecular mechanisms. At least two such repair processes are largely described in the literature. They are called Non Homologous End Joining (denoted by NHEJ) and Homologous Recombination (HR). The first one is a biochemical process in the sense that the repair is done regardless of the initial DNA information. It consists in joining the two broken ends, but this comes at the cost of losing some parts of the DNA molecule at both extremities [?]. On the contrary, the second mechanism, Homologous Recombination, is more conservative. We may call it a genetic mechanism: it restores the injured genetic information by using a similar one. This similar genetic information comes from the chromosome homologous to the broken one, which is thus employed as a template to repair the broken chromosome. This repair mechanism is based on a recombination process [?].

The main difference between NHEJ and HR is that HR requires long stretches of similarity while NHEJ may involve sequence similarity but of shorter sequences. This is the reason why NHEJ is also called non-homologous recombination. The choice between the two repair mechanisms seems clearly determined: it depends on the DSB origin and on the state of the cell relative to the cell cycle [?, ?].

Rearrangements occur when the repair mechanism fails, or when it makes a mistake. NHEJ is likely to misrepair when two (or more) DSBs occur simultaneously on a genome. The joining of broken ends that do not come from the same breakpoint would in this case generate a rearrangement. For instance, if two DSBs occur on the same chromosome and NHEJ misjoins the broken ends, then the segment between the DSBs will be reversed (an inversion). The HR mechanism can err if a wrong template is used. Indeed, only sequence similarity is needed to initiate the recombination, and if the sequence used as template is not orthologous (coming from a same ancestor through speciation), more precisely if it is not at the same locus on the homologous chromosome, it may generate a rearrangement. If the recombination occurs between orthologous loci (allelic recombination), an exchange of DNA may happen but the genomic organisation will not be altered, since the location of the exchanged material on the chromosome does not change. Recombination between different loci, on the other hand, will lead to changes in genomic organisation: the exchanged material will no longer be at the original loci. This latter process is called Non-Allelic Homologous Recombination (NAHR). It has been shown to be the mechanism responsible for several human genomic disorders (reviewed in [?, ?, ?]), particularly when it leads to what is called an unbalanced rearrangement, which is a rearrangement leading to the gain or loss of DNA.

Contrary to HR, NHEJ, as far as we know, has rarely been implicated in evolutionary or disease rearrangements. We assume the reason is that this mechanism has left no trace of its occurrence (or none yet detected). Nevertheless, it is generally admitted that NHEJ can generate rearrangements; for instance [?] has experimentally determined the frequency of translocations generated by NHEJ (less than 3%).

Finally, to be viable, a rearrangement must survive the different steps of the cell cycle, such as replication and mitosis. Moreover, to be transmitted to the offspring, a rearrangement has to occur in the germ line, and to successfully pass the meiosis step. This is a biologically delicate and difficult step because meiosis can complete only if the chromosomes are correctly segregated. It is known, for instance, that some translocations are not possible because they prevent the right segregation of chromosomes [?]. Further, a rearrangement has to be selected and fixed in the population, which means that it has to provide some selective advantage. A rearrangement may also be polymorphic. Polymorphism is a condition in which a population possesses more than one allele at a locus. There may be several reasons why polymorphic rearrangements are observed. For instance, they can be maintained by a balance between variation and natural selection, or because some heterozygous advantage is conferred over individuals who have two copies of the wild type allele. If selection is operating, migration can also introduce polymorphism into a population. Multiple niche polymorphism exists when different genotypes have different fitnesses in different niches. Genetic drift is a further possible source of genetic variation.

3 Detecting breakpoints

To study rearrangements, the only data available are the present-day arrangements of the genomes. To reconstruct the rearrangement scenarios that have occurred since the divergence of two genomes, the first step is to identify the regions of the genomes that have not been broken, that is, the conserved segments. We may assume, by the parsimony hypothesis, that they must derive from the same region in the genome of their closest common ancestor.

3.1 Experimental methods

Various experimental methods have been developed to analyse karyotypes and identify conserved segments. A karyotype is the complete set of all chromosomes of a cell of any living organism. It is, in a sense, a snapshot of the chromosomes of a genome.

The first, and most intuitive, approach developed was to compare karyotypes of several organisms or individuals. By this means, it was possible only to "see" the differences in number or in size of chromosomes. Then, in the 1970s, a technique called chromosome banding appeared that enabled the identification of rearrangements at a finer scale. Using staining solutions, this technique allows the differentiation of several kinds of bands on chromosomes (the size of a band is roughly 4 Mb). Therefore, a chromosome can be characterised by its band pattern. This allows the comparison of the karyotypes from different species based on such patterns. The resolution remains however low, and only major rearrangements can be identified with this method, like chromosome fusions or fissions, large translocations and inversions. Moreover, the bands can be misleading when the species considered are not closely related, because the assignment of homologous bands then becomes too difficult.

Then, in the 1990s, a major technique in the field of cytogenetics was developed based on the principle of hybridization: Fluorescent In Situ Hybridization (FISH) [?, ?]. Briefly, hybridization is a molecular process that joins two complementary single strand DNA molecules to form one double strand DNA molecule. FISH uses this process to locate probes, that is, single strand DNA segments marked by fluorescence, on targeted sequences. For instance, FISH allows the detection of all the chromosomes of one species that share at least one conserved segment with a specific chromosome of another species. The resolution remains low because the fluorescent signal cannot be detected if the fluorescence does not cover a sufficiently long sequence. Depending on the condensation level of the chromatin analysed, resolution varies from 50 kb (chromosome at interphase) to 3 Mb (chromosome at metaphase). This method, called comparative chromosome painting, and further extended to deal with more distantly related species (zoo-FISH), allows the identification mainly of inter-chromosomal rearrangements. It has been used to identify such rearrangements between a great number of species [?, ?].

Array comparative genomic hybridization (array CGH, also denoted by array-based CGH) is another, more recently developed molecular-cytogenetic method that allows the detection of some types of chromosomal changes: unbalanced ones only, not balanced reciprocal translocations nor inversions. In particular, it is being extensively used to analyse copy number changes (losses, gains and amplifications) in the DNA content of cells. The technique is derived from conventional CGH, which
enables the characterisation of both somatic and constitutional genomic DNA mutations. In conventional CGH, the DNA of interest and a reference are fluorescently labelled and hybridized to a normal metaphase preparation. In array-based CGH, large insert clones like BACs and PACs, containing human DNA, replace the metaphase preparation as target. Using microscopy and quantitative image analysis, regional differences in the fluorescence ratio of the DNA of interest versus the reference can be detected and used for identifying abnormal regions in the former. Neither CGH nor array CGH, however, provides information as to the precise location of the rearranged sequences.

Another experimental technique, called gene mapping, is also used to study rearrangements. It preceded both FISH and genome sequencing. The aim is to experimentally locate the genes on a genome with respect to one another. Gene mapping is usually done by two main techniques: linkage analysis and radiation hybrid mapping. The idea of the former is that if two loci are "linked", they are inherited together. The relative distance between two genes may thus be approximated by estimating the frequency at which they are observed to be simultaneously inherited, assuming that the distribution of crossing overs is uniform along the genomes. Radiation hybrid maps are obtained by irradiation of the studied genome before fusion in other cell lines. The irradiation cuts the chromosome into different fragments, which are independently kept or lost during the cell life (culture). The distance between two markers can then be estimated using the frequency at which they are found together, assuming that the closer the markers are, the less often they are separated by irradiation. Studying rearrangements using gene mapping data then consists in comparing gene orders. Of course, only those rearrangements that involve a gene can thus be studied, and the technique requires a good identification of the genes, and of the orthologs between species.

Finally, another novel technique appeared recently, called "end-sequencing profiles" (ESP for short). It consists in cloning BAC-sized parts of the genome of interest and sequencing only the extremities. The latter are mapped on a reference genome, mainly using sequence alignment. The distance between two corresponding ends is then compared with the normal size of a BAC sequence. If the distance is too different, it means that at least one rearrangement distinguishes the genome of interest from the reference one. This technique is mainly used on cancer or polymorphism data because it is cheaper than sequencing the whole genome of interest. It does however require that the reference genome is wholly sequenced.

In this survey, from now on, we concentrate only on the genomic methods to identify conserved segments. Such methods are based on the alignment of whole genomes. This enables the identification of the conserved segments between two genomes at a finer scale than FISH or similar methods, which in general cannot detect bands smaller than a few megabases in size. Nor does FISH allow the detection of intrachromosomal rearrangements. Genomic methods are also more precise than gene mapping, which further relies on orthology assignments that are often error-prone. However, they present the drawback of being applicable only when genomes have been wholly sequenced, and there are fewer of those than of genomes to which FISH-like techniques have been applied. Furthermore, comparing genomic sequences is not a trivial problem. For instance, whereas hybrids obtained by FISH may be considered (directly) as true homologs, as in the case of gene mapping, genomic alignments do not allow for a fully reliable identification of homology, or worse, of orthology. The ideal then, as argued in [?, ?, ?], would be to use both types of data simultaneously, something that has rarely been done up to now.

3.2 Genomic alignment

There are two main types of alignment algorithms: local and global. Global alignment algorithms, first described in [?, ?], seek to align two sequences from the beginning to the end of each. This therefore requires the two sequences to be well conserved with no changes in the order and orientation of any of their segments. On the other hand, local alignment algorithms, the first of which is due to Smith and Waterman [?], find the segments of each sequence
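
To make the idea of local alignment mentioned above more concrete, the following is a minimal Python sketch of a Smith-Waterman-style dynamic programme. It is an illustration only, not the formulation of any method surveyed here: the function name `smith_waterman`, the scoring values (match 2, mismatch -1) and the linear gap penalty (-2) are all assumptions chosen for the example. It returns the best local score together with the segment of each sequence that takes part in that alignment.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, segment_of_a, segment_of_b) for one best local alignment.

    The scoring values are illustrative assumptions; gaps use a linear penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j]: best score of a local alignment ending at prefix a[:i], b[:j]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets an alignment restart anywhere: this is what makes it local
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1      # aligned pair (match or mismatch)
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1                   # gap in b
        else:
            j -= 1                   # gap in a
    return best, a[i:end_i], b[j:end_j]


if __name__ == "__main__":
    print(smith_waterman("GGGACGTAAA", "CCCACGTTTT"))
```

In this toy input the two sequences share only the stretch ACGT, which is what the sketch reports. The table has one cell per pair of positions, so this quadratic formulation is only practical for short sequences.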