Show simple item record

dc.contributor.author: Rezaeipourfarsangi, Sima
dc.date.accessioned: 2023-12-15T16:01:58Z
dc.date.available: 2023-12-15T16:01:58Z
dc.date.issued: 2023-12-13
dc.identifier.uri: http://hdl.handle.net/10222/83275
dc.description.abstract: Deep language models have become increasingly prominent in machine learning. This thesis explores their potential for text representation and their role in three text mining applications: interactive document clustering, sales forecasting, and document ranking. First, in interactive document clustering, we present a novel system that replaces key-term-based clustering with deep language models, allowing users to steer the clustering algorithm through the system based on their domain knowledge. Second, we introduce a novel approach to forecasting sales of new products that incorporates product descriptions as an additional feature. Products are clustered by description similarity using deep language model embeddings combined with dimensionality reduction; cluster descriptions are obtained with Top2Vec, and demand for a new product is predicted from the historical sales of related clusters and previously introduced products. Third, in document ranking, we propose a novel approach for ranking resumes by their similarity to a given job description. Using Siamese neural networks with integrated CNN, LSTM, and attention layers, the model captures sequential, local, and global patterns to extract features and represent the documents, which are encoded with deep language models before being fed to the network. The model achieves improved ranking accuracy and a better match between job descriptions and resumes, surpassing comparative models. The versatility of deep language models arises from their ability to learn from vast amounts of text, extracting meaningful patterns and insights. In our research, we used state-of-the-art deep language models including SBERT, RoBERTa, the Universal Sentence Encoder, InferSent, and BigBird. [en_US]
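The ranking approach described in the abstract ultimately scores resumes by embedding similarity to a job description. A minimal sketch of that scoring step, using toy vectors that stand in for SBERT embeddings (this is illustrative only, not the thesis code, and all names here are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted by descending similarity."""
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy 3-d vectors standing in for sentence-embedding outputs.
job = [1.0, 0.0, 1.0]
resumes = [
    [0.9, 0.1, 1.1],  # close to the job description
    [0.0, 1.0, 0.0],  # unrelated
    [1.0, 0.5, 0.5],  # partial overlap
]
ranking = rank_by_similarity(job, resumes)
print([i for i, _ in ranking])  # prints [0, 2, 1]: most similar first
```

In the thesis, the embeddings come from deep language models and the similarity function is learned by a Siamese network rather than fixed to cosine; the sketch only shows the shape of the ranking step.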
dc.language.iso: en [en_US]
dc.subject: Document Representation [en_US]
dc.title: Deep Language Models for Text Representation in Document Clustering and Ranking [en_US]
dc.date.defence: 2023-11-29
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Doctor of Philosophy [en_US]
dc.contributor.external-examiner: Pawan J. Lingras [en_US]
dc.contributor.thesis-reader: Vlado Keselj [en_US]
dc.contributor.thesis-reader: Ehsan Sherkat [en_US]
dc.contributor.thesis-reader: Fernando Paulovich [en_US]
dc.contributor.thesis-supervisor: Evangelos E. Milios [en_US]
dc.contributor.ethics-approval: Received [en_US]
dc.contributor.manuscripts: Yes [en_US]
dc.contributor.copyright-release: Yes [en_US]