A HIERARCHICAL STRUCTURED MACHINE-LEARNING METHOD FOR LARGE-SCALE MULTI-CLASS PROBLEMS
Abstract
When a clinician diagnoses a patient, they choose one diagnosis from many possible diagnoses. This is a laborious process that requires input from many different sources of information. An objective tool that predicts a patient’s diagnosis from readily available clinical information would therefore be useful.
Such a tool must still choose among many possible diagnoses, which makes this a large-scale multi-class problem that conventional classification methods may not be suited to solve. We describe a method that assigns a class label to an observation from a large number of possible class labels and gives the probability that the observation belongs to that class. The method combines support vector machines with an agglomerative hierarchical clustering algorithm. We demonstrate the performance of the method on a benchmark problem and on a hospital-based dataset from Halifax, NS.
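
One way to read the combination described above, offered as an assumption rather than as the authors' stated formulation: if agglomerative clustering arranges the classes as the leaves of a tree, and a probabilistic classifier (for example, a support vector machine with calibrated outputs) is trained at each internal node, then the probability of a class can be written as the product of the branch probabilities along the path from the root to that leaf. The symbols $x$ (the observation), $c$ (a class label), and $v_0, \dots, v_m$ (the nodes on the path to $c$) are introduced here for illustration only.

\[
P(y = c \mid x) \;=\; \prod_{j=1}^{m} P(v_j \mid v_{j-1}, x), \qquad v_0 = \text{root}, \quad v_m = c .
\]

Under this reading, the predicted label would be the class with the largest such product.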