[Colloq] Thesis Defense: Stefan Savev 9/7/12 at 2:00pm in 366 WVH

Nicole Bekerian nicoleb at ccs.neu.edu
Tue Sep 4 09:50:00 EDT 2012



The College of Computer and Information Science presents:

Date: Friday, September 7, 2012
Time: 2:00 PM
Place: West Village H 366

Thesis Defense
Title: Collection Construction Methodologies for Learning to Rank
Speaker: Stefan Savev
Advisor: Jay Aslam

Abstract:
Ranking documents in response to user queries is one of the fundamental problems in Information Retrieval. Learning to Rank has emerged as an effective approach for data-driven construction of ranking algorithms. Although many such algorithms have been proposed, the effect of the properties of the training data on the algorithms learned from it has not been systematically studied. Creating a training dataset requires great effort, since every document in the dataset has to be manually labeled with its degree of relevance. Motivated by the importance of efficiently creating quality datasets, we study 1) the effect of characteristics of the training dataset on algorithm quality and 2) theoretically grounded methods for constructing training datasets. With regard to the first topic, we establish through a number of controlled experiments that properties such as the distribution of documents across relevance grades and the distances between relevance categories are useful predictors of dataset quality. With regard to the second, drawing on the statistical theory of Optimal Design of Experiments, we provide a theoretical foundation for the identified characteristics in training set selection criteria. The underlying intuition of these criteria is that one should simultaneously maximize the diversity between feature vectors and the representation of relevant documents.



-- 
Best, 
Nicole 

______________________________________________________________ 

Nicole Bekerian 
Administrative Assistant 

Northeastern University 
College of Computer and Information Science 
360 Huntington Ave. 
202 West Village H 
Boston, MA 02115 

Phone: 617.373.2462 
Fax: 617.373.5121