[Colloq] PhD Thesis Defense - Emine Yilmaz - Tuesday, Nov. 6

Rachel Kalweit rachelb at ccs.neu.edu
Fri Nov 2 09:21:04 EDT 2007


 

College of Computer and Information Science
Presents

PhD Thesis Defense by:
Emine Yilmaz

Title:
Informative and Efficient Evaluation of Retrieval Systems

Tuesday, November 6, 2007
11:30am
366 West Village H

Abstract
We consider the problem of evaluating the quality of information retrieval systems such as search engines. In a typical search engine development cycle, (1) a retrieval algorithm for returning relevant documents in response to a user query is developed, (2) the retrieval algorithm is tested over a large set of representative queries, and (3) the quality of the retrieval algorithm is assessed with respect to one or more performance metrics. This cycle is continually repeated in order to optimize (tune) the performance of the retrieval algorithm with respect to the performance metric(s). 

Much research has been devoted to the development of accurate retrieval algorithms, but relatively little has been devoted to the efficient and effective evaluation of such algorithms: many popular performance metrics are ad hoc, and many popular evaluation paradigms are effectively brute force. Optimizing a retrieval algorithm with respect to a poor metric will yield poor performance, and brute-force evaluation paradigms are exceedingly expensive and impractical on a large scale.
 
We present models for the efficient and effective evaluation of information retrieval systems. In the first part of this thesis, we consider the problem of analyzing the quality of various measures of retrieval performance, and we describe a model based on the maximum entropy method for assessing the quality and utility of any performance metric.
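As a rough illustration of the flavor of such a maximum entropy model (a sketch only; the precise constraints and measures are those of the thesis, not necessarily these), suppose the relevance of the document at each rank i of a ranked list of length N is treated as an independent Bernoulli variable with unknown probability p_i, and the distribution of maximum entropy consistent with the reported metric value is chosen:

\begin{align*}
\max_{p_1,\ldots,p_N}\quad & -\sum_{i=1}^{N} \bigl[\, p_i \log p_i + (1 - p_i)\log(1 - p_i) \,\bigr] \\
\text{subject to}\quad & \mathbb{E}_p[\text{metric}] = \text{value reported for the ranked list}, \\
& \textstyle\sum_{i=1}^{N} p_i = R \quad \text{(expected number of relevant documents retrieved)}.
\end{align*}

Intuitively, a highly informative metric constrains this distribution tightly, so that quantities inferred from it (a precision-recall curve, say) closely match their true values, whereas an uninformative metric leaves the distribution nearly unconstrained.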
 
In the second part of this thesis, we consider efficient evaluation of retrieval systems on a large scale. First, we describe methods based on statistical sampling theory for efficiently assessing the performance of any retrieval algorithm with respect to a performance metric. We then describe a method that can be used to efficiently and accurately infer a large judged set of documents from a relatively small number of judged documents, thus permitting accurate and efficient evaluation on a large scale. 
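As a small, self-contained illustration of the sampling idea (a sketch under simplifying assumptions, not the estimators developed in the thesis: the function names, the Poisson sampling design, and the synthetic data below are all hypothetical), one can judge a random subset of the documents, sampling top ranks more heavily, and correct for the sampling with inverse inclusion probabilities in the style of a Horvitz-Thompson estimator:

import random

def sample_judgments(ranked_docs, inclusion_prob, judge):
    """Poisson sampling: each document is judged independently with its own
    inclusion probability (typically larger near the top of the ranking)."""
    sample = {}
    for rank, doc in enumerate(ranked_docs):
        p = inclusion_prob(rank)
        if random.random() < p:
            sample[doc] = (judge(doc), p)  # (relevance in {0, 1}, inclusion probability)
    return sample

def estimated_precision_at_k(ranked_docs, sample, k):
    """Horvitz-Thompson style estimate of precision@k: each sampled relevant
    document in the top k contributes 1/p rather than 1."""
    total = 0.0
    for doc in ranked_docs[:k]:
        if doc in sample:
            relevant, p = sample[doc]
            total += relevant / p
    return total / k

# Hypothetical usage with synthetic relevance judgments.
ranked = ["doc%d" % i for i in range(1000)]
truth = {d: int(random.random() < 0.1) for d in ranked}
prob = lambda rank: max(0.05, 1.0 / (1.0 + rank / 20.0))  # decays with rank
judged = sample_judgments(ranked, prob, lambda d: truth[d])
print(estimated_precision_at_k(ranked, judged, 100))

With Poisson sampling the estimate is unbiased as long as every document has a nonzero inclusion probability; concentrating the probability mass near the top of the ranking simply reduces the variance for top-heavy metrics, which is what makes evaluation at a small fraction of the usual judging cost plausible.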

Our results on large test collections demonstrate that (1) some performance metrics are quantifiably much more informative than others and (2) accurate retrieval evaluation can be performed at as little as 5% of the cost of standard methodologies.

Committee:
Javed Aslam (Advisor) 
Harriet Fell 
Stephen Robertson (Microsoft Research) 
Ravi Sundaram



