[Colloq] Hiring Talk - Jennie Duggan - Managing Arrays for Science Applications at Scale - March 31, 10:30am, 366

Jessica Biron bironje at ccs.neu.edu
Tue Mar 25 13:34:35 EDT 2014



Jennie Duggan
Monday, March 31st, 2014
10:30am
366 WVH

"Managing Arrays for Science Applications at Scale"

Science applications are becoming increasingly data-driven. Researchers are collecting new data at an unprecedented scale, and much of it is stored in multidimensional arrays. Workloads over this data consist of complex transformations, many of which query the data spatially. The established relational model of data management cannot support this new class of applications. At the same time, scientists are increasingly conducting their experiments on large, shared-nothing clusters in lieu of purpose-built platforms. As a result, processor time is becoming more plentiful and network bandwidth is the scarcer resource.



In this talk, I will describe my research on efficiently distributing arrays for scientific workloads. This work is done in the context of SciDB, an open-source array database system built for applications with complex analytics. I will first present our optimization of data-intensive queries to minimize their use of network resources. Our approach uses analytical cost modeling to assign segments of a distributed query to individual database nodes. The second part of my talk will present research on data placement for elastic array databases. The partitioning scheme minimizes the time needed to reorganize the database for a change in the hardware configuration, while optimizing the layout of multidimensional data structures for spatial queries.





Bio: 
Jennie Duggan is a postdoctoral associate at the Massachusetts Institute of Technology, where she works with Michael Stonebraker. She received her Ph.D. from Brown University in December 2012 under the guidance of Ugur Cetintemel. Her research interests include scientific data management, database workload modeling, and cloud computing. She is especially focused on making data-driven science applications fast and scalable. In her spare time, she enjoys cooking, sailing, and traveling.

