[Colloq] Thesis Proposal presentation

Lynne Panarese panarese at ccs.neu.edu
Tue Jun 21 11:01:04 EDT 2011

The College of Computer and Information Science presents: 


Thesis Proposal Presentation: 

Speaker: Stefan Savev 

Date: June 22, 2011 (Wednesday) at 9:00am 
Where: WVH 366 

Thesis title: 
"Collection Construction Methodologies for Learning to Rank" 

Abstract: 
"Ranking documents in response to user queries is one of the 
fundamental problems in Information Retrieval. Learning to Rank has 
emerged as an effective approach for data-driven construction of ranking 
algorithms. Although many algorithms have been created, the effect of 
the properties of the training data through which such algorithms are 
developed has not been systematically studied. The creation of a learning 
dataset requires great effort since every document in the dataset has 
to be manually labeled with its degree of relevance. Motivated by the 
importance of the problem of efficiently creating quality datasets, I 
propose to study 1) the effect of characteristics of the training dataset on 
algorithm quality and 2) theoretically founded methods for construction 
of training datasets. I achieve the first goal through a number of 
controlled experiments from which I establish that properties such as the 
distribution of documents across relevance grades and distances between 
relevance categories are useful predictors of dataset quality. I achieve 
the second goal by modeling the identified characteristics in training set 
selection criteria. The criteria are theoretically founded in the statistical 
theory of Optimal Design of Experiments and carry the intuition of 
simultaneously maximizing the diversity between features, queries and 
estimated relevance grades." 

Thesis committee: 

Javed Aslam (advisor) 
Mirek Riedewald 
Ravi Sundaram 
Ben Carterette (external member, University of Delaware) 


