[Colloq] Shahzad Rajput Paper Requirement Fulfillment Presentation

panarese at ccs.neu.edu
Mon Jun 13 08:29:25 EDT 2011





The College of Computer and Information Science presents: 


Paper Requirement Fulfillment Presentation 


Speaker: Shahzad Rajput 


Date: June 20, 2011 (Monday) 
Time: 3:30 p.m. 
Where: WVH 366 



Title: A Nugget-based Test Collection Construction Paradigm 





Abstract: 


The problem of building test collections is central to the development of information retrieval systems such as search engines. The primary use of test collections is the evaluation of IR systems. The widely employed "Cranfield paradigm" dictates that the information relevant to a topic be encoded at the level of documents, therefore requiring effectively complete document relevance assessments. As this is no longer practical for modern corpora, numerous problems arise, including scalability, reusability, and applicability. 


We propose a new method for relevance assessment based on relevant information, not relevant documents. Once the relevant information is collected, any document can be assessed for relevance, and any retrieved list of documents can be assessed for performance. Starting with a few relevant "nuggets" of information manually extracted from existing TREC corpora, we implement and test a method that finds and correctly assesses the vast majority of relevant documents found by TREC assessors, as well as up to four times more additional relevant documents. We then show how these inferred relevance assessments can be used to perform IR system evaluation. Our main contribution is a methodology for producing test collections that are highly accurate, more complete, scalable, and reusable, and that can be generated with an amount of effort similar to that of existing methods, with great potential for future applications. 
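
As a rough illustration of the idea in the abstract above (assess documents against collected nuggets, then score a retrieved list with the inferred judgments), here is a minimal Python sketch. It is not the speaker's method: the word-overlap matching rule, the 0.8 threshold, and the names `nugget_match`, `infer_relevance`, and `precision_at_k` are all assumptions made up for this example, and the sample nugget and documents are invented.

```python
import re


def tokens(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def nugget_match(nugget, document):
    """Fraction of the nugget's distinct words that also occur in the document.

    Assumption: plain word overlap stands in for whatever matching
    function the actual work uses.
    """
    nugget_words = set(tokens(nugget))
    if not nugget_words:
        return 0.0
    doc_words = set(tokens(document))
    return len(nugget_words & doc_words) / len(nugget_words)


def infer_relevance(document, nuggets, threshold=0.8):
    """Treat a document as relevant if any nugget matches it well enough.

    The threshold value is an arbitrary choice for this sketch.
    """
    return any(nugget_match(n, document) >= threshold for n in nuggets)


def precision_at_k(ranked_doc_ids, docs, nuggets, k, threshold=0.8):
    """Precision of the top-k retrieved documents under inferred relevance."""
    top = ranked_doc_ids[:k]
    if not top:
        return 0.0
    hits = sum(infer_relevance(docs[d], nuggets, threshold) for d in top)
    return hits / len(top)


# Invented toy data: one nugget and three short documents.
nuggets = ["the Cranfield paradigm requires document relevance judgments"]
docs = {
    "d1": "Under the Cranfield paradigm, evaluation requires relevance "
          "judgments for each document.",
    "d2": "Search engines rank web pages using link analysis.",
    "d3": "Document relevance judgments are what the Cranfield paradigm "
          "requires.",
}

# Score a hypothetical system's ranked list using the inferred judgments.
score = precision_at_k(["d1", "d2", "d3"], docs, nuggets, k=2)
```

Note that once the nuggets exist, documents never seen by an assessor (such as `d3` here) can still be judged, which is the property the abstract relies on for reusability.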


Advisor: Prof. Javed Aslam 


PhD committee representatives: Prof. Rajmohan Rajaraman and Prof. Timothy Bickmore 



