dc.contributor.author | Xu, Isaac | |
dc.date.accessioned | 2022-08-04T17:48:57Z | |
dc.date.available | 2022-08-04T17:48:57Z | |
dc.date.issued | 2022-08-04 | |
dc.identifier.uri | http://hdl.handle.net/10222/81773 | |
dc.description | This work seeks insight into notions of task complexity and difficulty by predicting the degree of difficulty a population of models will encounter on arbitrary classification tasks over a toy dataset. The differing model evaluation results across these tasks motivate an argument for a label-free means of evaluating models. Label-free methods for evaluating learning, such as clustering-based metrics and entropy, are examined. Entropy proved the most effective measure for evaluating learning, but issues pertaining to learning methodology and early-training instability require further study. | en_US
dc.description.abstract | In this work, we explore the viability of proposed label-free metrics for
evaluating models. We begin by examining the effect that different viable label
schemes on an identical dataset may have on linear probe accuracy. We show that,
in a toy setting, a notion of “complexity” for distinguishing classes can predict
the relative “difficulty” a population of models may encounter when comparing
classification tasks. By establishing these arbitrary relative differences among
valid formulations of an evaluation task, we justify the search for a
label-scheme-independent means of evaluating learning. To this end, we examine
label-free clustering-based metrics and entropy on representational spaces at
progressive milestones during self-supervised learning and on pre-trained
representational spaces. While clustering-based metrics show mixed success,
entropy may be viable for monitoring learning and for cross-architectural
comparisons, despite displaying instability in early training and differing
trends for certain learning methodologies. | en_US
dc.language.iso | en | en_US |
dc.subject | Machine Learning | en_US |
dc.subject | Self-Supervised Learning | en_US |
dc.subject | Clustering | en_US |
dc.subject | Complexity | en_US |
dc.subject | Information Theory | en_US |
dc.title | Towards a Label-Free and Representation-Based Metric for Evaluating Machine Learning Models | en_US |
dc.date.defence | 2022-07-20 | |
dc.contributor.department | Faculty of Computer Science | en_US |
dc.contributor.degree | Master of Computer Science | en_US |
dc.contributor.external-examiner | n/a | en_US |
dc.contributor.graduate-coordinator | Dr. Michael McAllister | en_US |
dc.contributor.thesis-reader | Dr. Malcolm Heywood | en_US |
dc.contributor.thesis-reader | Dr. Sageev Oore | en_US |
dc.contributor.thesis-supervisor | Dr. Thomas Trappenberg | en_US |
dc.contributor.ethics-approval | Not Applicable | en_US |
dc.contributor.manuscripts | Not Applicable | en_US |
dc.contributor.copyright-release | Not Applicable | en_US |