dc.contributor.author: Bong, Fabian
dc.date.accessioned: 2024-08-16T14:30:30Z
dc.date.available: 2024-08-16T14:30:30Z
dc.date.issued: 2024-08-15
dc.identifier.uri: http://hdl.handle.net/10222/84419
dc.description.abstract: Since the turn of the 21st century, research in the biomedical sciences has dramatically shifted and become reliant on high-throughput bioanalytical measurements for investigating bio-molecular events that are presumed to be self-orchestrated and organized in hierarchies. These measurements yield large datasets which, consequently, require specialized statistical (bioinformatics) approaches with which to extract the most meaningful information given observational or designed experiments. At the outset, data acquisition outpaced model development, and ad-hoc approaches were used to glean the most obvious information. With the maturity of bioinformatics, important questions are beginning to emerge regarding the extent to which significant biological information can be extracted from these data given challenges such as: (a) their divergent distributional characteristics, (b) the large dynamic range in their signals, and (c) allowable effect sizes given limitations in sample size and associated variability. Here, two approaches are introduced to overcome these challenges. First, kurtosis-based projection pursuit, augmented with classification and regression trees (kPPA-CART), is proposed as a robust, easy-to-implement approach to model multi-omics data that are derived from next-generation sequencing (NGS) and mass spectrometry (MS). Most of the available methods for unsupervised multi-omics integration suffer from the inability to model low-intensity (low count) features and instead focus on highly variable (dominant) features. Comprehensive benchmarking of existing multi-omics integration tools against kPPA-CART was performed using simulated data where the changes involved in a hypothetical biological phenomenon are associated with low-intensity signals and small effect sizes. The results show that kPPA-CART provides a superior recovery of this information.
The application of this method is supported by the development of an R package (https://github.com/FabianBong/KPPACart) and an easily accessible web tool (https://intmove.vercel.app/) that allow experimentalists to implement kPPA-CART without the need for computational training. Second, to the extent that measurement uncertainties affect data analysis strategies, many methods for -omics data analysis assume independent, identically and normally distributed (iid normal) noise. When this assumption is violated, one practical solution is to incorporate the true structure of the measurement error variance in the analysis. However, this requires extensive replication which can become prohibitive. Here, two approaches (Frequentist and Bayesian) are introduced for developing a parametric estimate of error variance incorporating shot and proportional noise for LC-MS data using, as a base, empirical replicate measurements. This thesis provides evidence that both methods accurately recapitulate the parameters of the variance function while accounting for sensitivity differences between replicate samples, and enable test statistics from the exponential family of distributions to be conducted without loss of generality. [en_US]
dc.language.iso: en [en_US]
dc.subject: kPPA [en_US]
dc.subject: Low-Intensity Signals [en_US]
dc.subject: Measurement Uncertainty [en_US]
dc.subject: Multi-Omics [en_US]
dc.subject: Bayesian Statistics [en_US]
dc.title: Advanced Strategies for Modeling Low-intensity Signals from Multi-omics Data [en_US]
dc.date.defence: 2024-08-09
dc.contributor.department: Department of Computational Biology and Bioinformatics [en_US]
dc.contributor.degree: Master of Science [en_US]
dc.contributor.external-examiner: N/A [en_US]
dc.contributor.thesis-reader: Aaron MacNeil [en_US]
dc.contributor.thesis-reader: Peter Wentzell [en_US]
dc.contributor.thesis-supervisor: Tobias Karakach [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]