A Universal Calculus for Stream Processing Languages
Date and Time
Tuesday, May 18, 2010 - 11:00am to 12:00pm
Location
Computer Science 302
Type
Talk
Speaker
Robert Soule, from New York University
Stream processing applications such as algorithmic trading, MPEG processing, and web content analysis are ubiquitous and essential to business and entertainment. Language designers have developed numerous domain-specific languages that are both tailored to the needs of their applications and optimized for performance on their particular target platforms. Unfortunately, the goals of generality and performance are frequently at odds, and prior work on the formal semantics of stream processing languages does not capture the details necessary for reasoning about implementations. This talk presents Brooklet, a core calculus for stream processing that allows us to reason about how to map languages to platforms and how to optimize stream programs. We translate from three representative languages, CQL, StreamIt, and Sawzall, to Brooklet, and show that the translations are correct. We formalize three popular and vital optimizations, data-parallel computation, operator fusion, and operator re-ordering, and show under which conditions they are correct. Language designers can use Brooklet to specify exactly how new features or languages behave. Language implementors can use Brooklet to show exactly under which circumstances new optimizations are correct. In ongoing work, we are developing an intermediate language for streaming that is based on Brooklet. We are implementing our intermediate language on System S, IBM's high-performance streaming middleware.
This work is based upon the paper "A Universal Calculus for Stream Processing Languages" by Robert Soule, Martin Hirzel, Robert Grimm, Buğra Gedik, Henrique Andrade, Vibhore Kumar, and Kun-Lung Wu, in the Proceedings of ESOP 2010.
Robert Soule is a Ph.D. candidate at New York University, working with Professor Robert Grimm, and a research co-op in the Data Intensive Systems and Analytics Group at IBM T. J. Watson Research Center, under the guidance of Martin Hirzel. His research interests are in distributed systems and language support for building systems.