[Colloq] Thesis Proposal presentation
Lynne Panarese
panarese at ccs.neu.edu
Tue Jun 21 11:01:04 EDT 2011
The College of Computer and Information Science presents:
Thesis Proposal Presentation:
Speaker: Stefan Savev
Date: June 22, 2011 (Wednesday) at 9:00am
Where: WVH 366
Thesis title:
"Collection Construction Methodologies for Learning to Rank"
Abstract:
"Ranking documents in response to user queries is one of the
fundamental problems in Information Retrieval. Learning to Rank has
emerged as an effective approach for data-driven construction of ranking
algorithms. Although many algorithms have been created, the effect of
the properties of the training data through which such algorithms are
developed has not been systematically studied. The creation of a learning
dataset requires great effort since every document in the dataset has
to be manually labeled with its degree of relevance. Motivated by the
importance of the problem of efficiently creating quality datasets, I propose
to study 1) the effect of characteristics of the training dataset on
algorithm quality and 2) theoretically founded methods for construction
of training datasets. I achieve the first goal through a number of controlled
experiments from which I establish that properties such as the
distribution of documents across relevance grades and distances between
relevance categories are useful predictors of dataset quality. I achieve
the second goal by modeling the identified characteristics in training set
selection criteria. The criteria are theoretically founded in the statistical
theory of Optimal Design of Experiments and carry the intuition of
simultaneously maximizing the diversity between features, queries and
estimated relevance grades."
Thesis committee:
Javed Aslam (advisor)
Mirek Riedewald
Ravi Sundaram
Ben Carterette (external member, University of Delaware)