dc.contributor.author: Zhan, Luyao
dc.date.accessioned: 2016-05-25T12:32:55Z
dc.date.available: 2016-05-25T12:32:55Z
dc.date.issued: 2016-05-25T12:32:55Z
dc.identifier.uri: http://hdl.handle.net/10222/71706
dc.description.abstract: Biodiversity conservation plays an important role in the maintenance of a healthy ecosystem. Genetic diversity provides a foundation for understanding diversity at the organism and population levels of organization. Genomic DNA markers offer the opportunity to identify genetic variations that distinguish populations, and can be used to investigate the underlying forces that drive adaptation to different environments. Microsatellites, or short simple-repeat DNA sequences, are among the most popular genetic markers for many biological applications. However, microsatellite data require extensive manual checking for errors and characteristic signals, a laborious process that can take days or weeks for a single dataset. We have developed MEGASAT, a bioinformatics approach that automates microsatellite genotyping from DNA sequence data. MEGASAT uses fuzzy matches and counting of frequently observed sequences to distinguish true genotype signal from errors. We validated MEGASAT using microsatellite data from a population sample of 71 guppies from Trinidad, demonstrating a high level of reproducibility and accuracy of MEGASAT-called genotypes by a combination of genotyping error estimation methods. We also developed a random-forest (RF) based method to identify adaptive gene variants, and the environmental factors associated with those variants, in sea scallop data. Our approach uses the inverse Cholesky transformation to account for spatial autocorrelation in the genetic and environmental data, and ordination techniques to further explore the relationships between these two data sets. Variable-importance rankings from the RF models and the ordination techniques were both applied to corrected and uncorrected data to identify which environmental variables play an important role in shaping the genetic structure of sea scallop populations. [en_US]
dc.language.iso: en [en_US]
dc.subject: microsatellite genotyping [en_US]
dc.subject: environmental associations [en_US]
dc.title: INFERRING ECOLOGICAL POPULATION STRUCTURE AND ENVIRONMENTAL ASSOCIATIONS THROUGH AUTOMATED ANALYSIS OF REPEAT-CONTAINING AND POLYMORPHIC DNA SEQUENCES [en_US]
dc.date.defence: 2016-05-03
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Master of Computer Science [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Norbert Zeh [en_US]
dc.contributor.thesis-reader: Christian Blouin [en_US]
dc.contributor.thesis-reader: Daniel Ruzzante [en_US]
dc.contributor.thesis-supervisor: Robert Beiko [en_US]
dc.contributor.thesis-supervisor: Paul Bentzen [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: No [en_US]
dc.contributor.copyright-release: No [en_US]