dc.contributor.author | Hayes, Mary Gunn
dc.date.accessioned | 2020-05-15T17:41:02Z
dc.date.available | 2020-05-15T17:41:02Z
dc.date.issued | 2020-05-15T17:41:02Z
dc.identifier.uri | http://hdl.handle.net/10222/79209
dc.description.abstract | Micro-organisms seem to flagellate about wherever they please, in our bodies and in the natural and built environments, but they are more cunning than their meandering behavior would suggest. By creating networks of biochemical pathways, communities of microbes are able to modulate the properties of their environment and even biochemical processes within their hosts. Next-generation high-throughput sequencing has led to a new frontier in microbiology and microbial ecology which promises the ability to leverage the microbiome for good in every facet of our lives, and the stakes are high as global society hurtles toward several apocalyptic ecological crises. However, along with the fascinating complexity of microbial community dynamics come equally complex data considerations for researchers: genomic data are high-dimensional, sparse, noisy, and refuse to cooperate with authorities. In fact, they will not even cooperate with each other, which prohibits the sorts of consensus-based validation and meta-analysis that we rely on in science. In this thesis we propose an ensemble approach for cross-study exploratory analyses of microbial abundance data, in which we first estimate the variance-covariance matrix from each dataset assuming Poisson sampling, and subsequently model these covariances jointly so as to find a shared low-dimensional subspace of the feature space. By viewing the projection of the latent true abundances onto this common structure, the variation is pared down to that which is shared among all datasets, and is likely to reflect more generalizable biological signal than can be inferred from an individual dataset. We investigate several ways of achieving this, and demonstrate that they work well on simulated and real metagenomic data in terms of signal retention and interpretability. | en_US
dc.language.iso | en | en_US
dc.subject | Microbiome | en_US
dc.subject | Batch effects | en_US
dc.title | Cross-Study Analyses of Microbial Abundance Using Generalized Common Factor Methods | en_US
dc.date.defence | 2020-05-08
dc.contributor.department | Department of Mathematics & Statistics - Statistics Division | en_US
dc.contributor.degree | Master of Science | en_US
dc.contributor.external-examiner | N/A | en_US
dc.contributor.graduate-coordinator | Dr. Joanna Mills Flemming | en_US
dc.contributor.thesis-reader | Dr. Tobias Kenney | en_US
dc.contributor.thesis-reader | Dr. Lam Ho | en_US
dc.contributor.thesis-supervisor | Dr. Hong Gu | en_US
dc.contributor.thesis-supervisor | Dr. Morgan Langille | en_US
dc.contributor.ethics-approval | Not Applicable | en_US
dc.contributor.manuscripts | Not Applicable | en_US
dc.contributor.copyright-release | Not Applicable | en_US
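
The two-stage idea summarized in the abstract (a per-dataset covariance estimate under Poisson sampling, followed by a shared low-dimensional subspace) can be illustrated with a minimal sketch. This is not the thesis's method: the joint modelling step is replaced here by a much simpler stand-in, namely averaging the covariance estimates and taking the leading eigenvectors. The function names, the simulated data, and the choice of k are all hypothetical. The covariance step relies on the standard moment identity that if counts are Poisson given latent abundances, the covariance of the counts equals the covariance of the latent abundances plus a diagonal matrix of their means.

```python
import numpy as np

def poisson_corrected_cov(counts):
    """Moment estimate of the latent-abundance covariance for one dataset.

    Assumes counts[i, j] ~ Poisson(lam[i, j]), so that
    Cov(counts) = Cov(lam) + diag(E[lam]); the per-feature mean count
    is therefore subtracted from the diagonal of the sample covariance.
    """
    counts = np.asarray(counts, dtype=float)
    sample_cov = np.cov(counts, rowvar=False)  # features are columns
    return sample_cov - np.diag(counts.mean(axis=0))

def shared_subspace(covs, k):
    """Top-k eigenvectors of the averaged covariance.

    A simple stand-in for joint modelling of the covariances,
    not the approach developed in the thesis.
    """
    pooled = np.mean(covs, axis=0)
    eigvals, eigvecs = np.linalg.eigh(pooled)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, top]

# Hypothetical simulated data: two studies sharing a 2-dimensional structure.
rng = np.random.default_rng(0)
p, k = 6, 2
loadings = rng.normal(size=(p, k))
datasets = []
for n in (200, 300):
    scores = rng.normal(size=(n, k))
    lam = np.exp(1.0 + 0.3 * scores @ loadings.T)  # latent abundances
    datasets.append(rng.poisson(lam))

covs = [poisson_corrected_cov(d) for d in datasets]
basis = shared_subspace(covs, k)

# Crude illustration only: centered counts are projected in place of the
# latent abundances, which are not observed.
projected = [(d - d.mean(axis=0)) @ basis for d in datasets]
```

Averaging gives every dataset equal weight regardless of sample size; a weighted pool or a proper common-factor model would be the natural refinements, and the latter is closer to what the title suggests.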