[Colloq] Mirek Riedewald hiring Talk

Patricia Freeman tricia at ccs.neu.edu
Fri Mar 21 15:57:30 EDT 2008


Mirek Riedewald will be joining us on Monday, 3/24, for a hiring visit.
His talk will be at 10:30 a.m. in Room 366 WVH.

Scalable Data Stream Processing

Speaker: Mirek Riedewald 

Date: Monday, March 24, 2008 

Talk: 10:30 a.m. - 11:30 a.m., 366 WVH

Abstract

Massive streams of data occur in many contexts: as sensor readings for natural or industrial processes, RFID readings in supply chain management, financial transactions, events generated by computing system monitoring infrastructure, and RSS feeds on the Web. Users want to monitor such streams and compute complex queries in (near) real time, e.g., to discover unusual system behavior or non-compliant financial transactions. Compared to traditional databases, the roles of queries and data are reversed: queries are continuously active and produce new results as data streams in. A data stream processing system's performance has to scale not only with the input stream rates, but also with the number of registered queries. Effective multi-query optimization techniques are crucial for achieving scalability.

In this talk, I will present a complete solution for scalable data stream processing. My approach is based on a novel language for formulating powerful data stream queries. This language combines features of relational algebra and regular expressions. I will show how queries written in this language can be translated into non-deterministic automata and how a large number of such automata can be processed efficiently as data is streaming in. The system implementing these ideas, Cayuga, achieves a throughput of tens of thousands of data items per second on a commodity PC, even when processing thousands of continuously active monitoring queries.

Brief Biography

Mirek Riedewald is a Research Associate in Cornell University's Computer Science Department. He received his Ph.D. in 2002 from the University of California, Santa Barbara. Dr. Riedewald's research interests are in the general area of databases and information systems. He is developing techniques for scalable data stream processing and new data mining and analysis approaches for data-driven science. He also worked as a Visiting Researcher at Microsoft Research, Redmond, for several months in fall 2006 and summer 2007, contributing his expertise in data stream processing.


