Due to advances in semiconductor and communications technology, we are
steadily moving towards a world filled with a plethora of embedded
sensing devices as well as "virtual" sensors within devices -- with
the potential to monitor the environment and to react to or even
anticipate changes. The environment being monitored might be a smart
hospital, an entire campus, or even an office or data center. Often
the behaviors we wish to produce involve combining continuously
updated (streaming) data from multiple heterogeneous sensor and
Internet sources. A major challenge is how to manage the resulting
complexity while building large-scale, rich monitoring, learning, and
control applications.

In the ASPEN project we have been exploring an approach that seeks to
separate logical dataflow from most of the algorithmic logic -- using
a declarative, SQL-like (but iterative and incremental) programming
model to capture the dataflow, data transformation, and state
management needed by an application, combined with small bits of
procedural code to handle complex logic. Our platform provides
distributed query optimization that takes runtime conditions into
account, while also supporting a range of learning, prediction, and
connection-finding algorithms. In this talk I will describe our basic
ASPEN prototype, including its cluster and sensor subsystems, and
provide an overview of how we address issues of query optimization,
distributed query execution, and incremental recomputation.

Work done jointly with Mengmeng Liu, Svilen Mihaylov, Boon Thau Loo,
and Sudipto Guha.

Zachary Ives is an Associate Professor and the Markowitz Faculty
Fellow at the University of Pennsylvania. His research interests
include data integration and sharing, "big data", sensor networks, and
data provenance and authoritativeness. He is a recipient of the NSF
CAREER award, and an alumnus of the DARPA Computer Science Study Panel
and Information Science and Technology advisory panel. He has also
been awarded the Christian R. and Mary F. Lindback Foundation Award
for Distinguished Teaching. He serves as the undergraduate curriculum
chair for Penn's Singh Program in Market and Social Systems
Engineering. He is a co-author of the textbook Principles of Data
Integration.