Designing and Developing Interactive Big Data Decision Support Systems for Performance, Scalability, Availability and Consistency

Burke, Neil

dc.contributor.author	Burke, Neil
dc.date.accessioned	2021-04-29T16:45:26Z
dc.date.available	2021-04-29T16:45:26Z
dc.date.issued	2021-04-29T16:45:26Z
dc.identifier.uri	http://hdl.handle.net/10222/80446
dc.description.abstract	Big data decision support systems are used to interpret meaning from extremely large data sets. The users of such systems rely on decision support systems to provide short, human-readable summarizations to aid the user in the decision-making process. An interactive big data decision support system must do all of this within seconds of a user request. This short response window promotes interactivity between the system and its user, enabling the user to make several ad hoc or follow-up queries to the system shortly after receiving a response. In this thesis, we explore the design and development of interactive big data decision support systems that satisfy four key useful characteristics: performance, scalability, availability and consistency. We do this within the context of two applications. We first design and develop a novel interactive reinsurance portfolio analytics system. Our system runs on a cloud architecture and efficiently distributes work to achieve excellent scalability, scaling up to thousands of cores. In order for our system to be highly performant, we design our system to process all data entirely in memory. Our system is made consistent by a decentralized data storage service that guarantees strong consistency for all input data. A queuing system that automatically retries failed tasks ensures that the system is highly available. In a comparison with one of the leading commercial portfolio analytics systems, our system performed approximately 50 times faster. Later, we further improve performance by caching intermediate results between portfolio analyses, allowing extremely complex location-level analytics queries to be processed in only 11 seconds. Without caching, the same queries would have to process hundreds of millions of transformations over terabytes of data. Our second application is Online Analytical Processing (OLAP), where we focus solely on data consistency.
We describe a method for quantifying consistency in distributed OLAP systems and present a corresponding Monte Carlo simulation to approximate the level of consistency for quorum-replicated OLAP systems, allowing users to explore their system's level of consistency under different usage scenarios. In a case study, we validate the accuracy of our simulation on a real, interactive OLAP system.	en_US
dc.language.iso	en	en_US
dc.subject	big data	en_US
dc.subject	high-performance computing	en_US
dc.subject	decision support	en_US
dc.title	Designing and Developing Interactive Big Data Decision Support Systems for Performance, Scalability, Availability and Consistency	en_US
dc.date.defence	2021-04-22
dc.contributor.department	Faculty of Computer Science	en_US
dc.contributor.degree	Doctor of Philosophy	en_US
dc.contributor.external-examiner	Dr. Owen Kaser	en_US
dc.contributor.graduate-coordinator	Dr. Milios Evangelos	en_US
dc.contributor.thesis-reader	Dr. Andrew Rau-Chaplin	en_US
dc.contributor.thesis-reader	Dr. Qigang Gao	en_US
dc.contributor.thesis-supervisor	Dr. Norbert Zeh	en_US
dc.contributor.thesis-supervisor	Dr. Oliver Baltzer	en_US
dc.contributor.ethics-approval	Not Applicable	en_US
dc.contributor.manuscripts	Not Applicable	en_US
dc.contributor.copyright-release	Not Applicable	en_US

Files in this item

Name: NeilBurke2021.pdf
Size: 1.768Mb
Format: PDF

This item appears in the following Collection(s)

Faculty of Graduate Studies Online Theses