Colloquium

Computational Information Design

Date and Time
Monday, September 19, 2005 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Ben Fry, from the Broad Institute of MIT & Harvard
Host
Adam Finkelstein
The ability to collect, store, and manage data is increasing quickly, yet our ability to understand it remains constant. In an attempt to gain better understanding of data, fields such as information visualization, data mining, and graphic design are employed, each solving an isolated part of the specific problem but failing in a broader sense: there are too many unsolved problems in the visualization of complex data. As a solution, I propose that the individual fields be brought together as part of a single process that I call Computational Information Design.

Bio: Ben Fry received a doctoral degree from the MIT Media Laboratory, where his research focused on methods of visualizing large amounts of data from dynamic information sources. His current research involves the visualization of genetic data at the Eli & Edythe Broad Institute of MIT & Harvard. His work has been shown at the Whitney Biennial in 2002 and the Cooper Hewitt Design Triennial in 2003. Other work has appeared at the Museum of Modern Art in New York, at Ars Electronica in Linz, Austria, and in the films "Minority Report" and "The Hulk". With Casey Reas of UCLA, he is currently developing Processing, an environment for teaching computational design and sketching interactive media software.

Understanding Network Dynamics: The Race is On

Date and Time
Tuesday, August 16, 2005 - 2:00pm to 3:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Professor Chen-Nee Chuah, from UC Davis
Host
Jennifer Rexford
While the Internet plays an important role in our day-to-day lives, the global routing infrastructure is surprisingly fragile, and the network dynamics of the Internet are not well understood. The RUBINET (Robust & Ubiquitous Networking) research group strives to leverage Internet measurement to characterize the interactions among various network components and across multiple layers. This talk will first give a high-level summary of our findings on Internet routing dynamics. Then we will revisit the design decision of overlay networks to allow routing control at the application layer.

Given the increasing popularity of overlay networks, it is critical to ask the question: can overlay networks and underlying IP networks form a synergistic co-existence? Overlay networks attempt to take control over routing to achieve better performance in the presence of network failures or congestion. ISPs do the same by employing common traffic engineering techniques such as link weight settings, load balancing, and routing policies. In this talk, I will illustrate how uncoordinated efforts by the two layers to recover from failures may cause performance degradation and unpredictable behavior from an ISP's view. We also show how current traffic engineering techniques are inadequate for dealing with emerging overlay network services.

Multiple similar or dissimilar overlay networks making independent routing decisions could also experience race conditions, resulting in oscillations in both route selection and network load. I will present our analytical model for predicting the synchronization probability of two overlays and show that it is non-negligible across a wide range of parameter settings, implying that the ill effects of synchronization should not be ignored. The model is validated through simulations that are designed to capture the transient routing behavior of both the IP and overlay layers. The effects of factors such as path diversity (measured in round-trip times) and probing aggressiveness on these race conditions will be discussed. Finally, I will discuss the implications of our study for the design of overlay networks and the choice of their path probing parameters.
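
As a toy illustration of the race condition described above, the following sketch (hypothetical; not the analytical model from the talk) simulates two overlays that greedily re-route onto whichever of two shared paths currently probes faster. Because both react to the same congestion signal, they switch in lockstep and oscillate.

```python
import random

# Two overlays each place one unit of traffic on one of two shared paths,
# probe both paths each step, and greedily switch to whichever looks
# faster. Reacting to the same congestion signal makes them flip together.

BASE_RTT = [50.0, 50.0]   # ms; two identical paths for simplicity
LOAD_PENALTY = 30.0       # ms of extra delay per unit of load on a path

def probed_rtt(path, load):
    # hypothetical probe: base delay + load-dependent delay + small jitter
    return BASE_RTT[path] + LOAD_PENALTY * load + random.uniform(0.0, 1.0)

choice = [0, 0]           # both overlays start on path 0
for step in range(8):
    load = [choice.count(p) for p in (0, 1)]
    for overlay in (0, 1):
        rtts = [probed_rtt(p, load[p]) for p in (0, 1)]
        choice[overlay] = rtts.index(min(rtts))
    print(f"step {step}: overlay path choices = {choice}")
# Typical output: both overlays hop 0 -> 1 -> 0 -> 1 in lockstep, since
# each probe sees the other overlay's load on the shared path.
```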

DIMES: A Distributed Internet Measurement Infrastructure: Rationale, Architecture, and Results

Date and Time
Monday, August 15, 2005 - 2:00pm to 3:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Yuval Shavitt and Eran Shir, from Tel Aviv University
Host
Jennifer Rexford
Due to the structure and routing of the Internet, the only way to map its topology is by having measurement presence in almost every corner of the Internet. Managing thousands of measurement boxes is impractical; thus, we suggest instead a light-weight software measurement agent to be downloaded by volunteers around the world. The DIMES agent can be executed on every PC (and, in the future, on even smaller devices, like PDAs) and enables us to map the Internet and track its evolution over time at several levels of granularity, from the fine router level to the coarse Autonomous System (AS) level. Currently, DIMES has over 4,300 installations in 75 nations around the world, which produce over 5 million measurements a day.
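
Conceptually, an agent of this kind is little more than a loop that probes targets and reports the results. The sketch below is a hypothetical minimal agent in that spirit; the real DIMES software, its upload protocol, and its target selection are not shown, and the Unix traceroute tool stands in for the agent's measurement engine.

```python
import json
import random
import subprocess
import time

# Hypothetical minimal measurement agent (not the actual DIMES agent).
# Assumes a Unix 'traceroute' binary on the PATH.

TARGETS = ["192.0.2.1", "198.51.100.1"]   # placeholder (TEST-NET) addresses

def measure(target, max_hops=30):
    # -n: numeric output (no DNS lookups), -m: maximum hop count
    out = subprocess.run(
        ["traceroute", "-n", "-m", str(max_hops), target],
        capture_output=True, text=True, timeout=120,
    )
    return {"target": target, "timestamp": time.time(), "raw": out.stdout}

def agent_loop(iterations=3, pause=5.0):
    results = []
    for _ in range(iterations):
        results.append(measure(random.choice(TARGETS)))
        time.sleep(pause)   # stay light-weight between probes
    # A real agent would upload the batch to a collection server;
    # here we just print it.
    print(json.dumps(results, indent=2))

if __name__ == "__main__":
    agent_loop()
```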

The talk will describe the rationale behind DIMES and explore some of our results at the AS, router, and IP levels. We will also describe our vision of building an Internet evolution model that takes into account external stimuli such as economic growth, and show some examples of the strong connection between Internet structure and world economics.

Operations and Management of IP Networks: What Researchers Should Know

Date and Time
Thursday, August 11, 2005 - 1:30pm to 3:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Aman Shaikh and Albert Greenberg, from AT&T Research
Host
Jennifer Rexford
This tutorial will shed light on the art and the science of managing large IP networks, such as those of tier-1 ISPs. It consists of four parts: (i) the architecture of a typical tier-1 service provider network, and the operational challenges associated with running such networks; (ii) a detailed description of various management and operational tasks; (iii) a case study of supporting an emerging application, VoIP; and (iv) research directions.

We describe the architecture of typical tier-1 service provider networks, in terms of the scale and diversity of equipment, protocols, and services, and show how these factors make management and operations of these networks a challenging task. We then discuss the key tasks of network operations: customer provisioning and care, network care, traffic engineering, network architecture and design, and security. We describe each task in detail and show how research and innovation in systems and tools fit in. We then consider the challenges of introducing new applications, with VoIP as an example. Finally, we discuss challenges and areas where further research and innovation are needed, including automation of operational tasks, managing the network as a whole rather than box by box, gauging the impact of network activities on customer performance, and scheduling maintenance activities in a seamless fashion.

Biographies

Aman Shaikh is a Technical Specialist at AT&T Labs (Research). He has published several papers on routing and network management. He has been responsible for the design, development, and implementation of an OSPF monitor that is deployed in enterprise and service provider networks. He is also actively involved in various activities related to the innovation and realization of route monitoring and network management. Recently he spent a month at the AT&T Network Operations Center (NOC), talking to several people who work "on the floor" day in, day out, and watching network operations from close quarters. This experience of observing Operations with a Research hat on will come through in many of the examples presented in the tutorial.

Albert Greenberg heads the Network Measurement and Engineering Department at AT&T Labs-Research, a group that conducts research in networking, with an emphasis on large-scale IP networks and systems and emerging Internet technologies. His research interests include Internet traffic measurement, modeling, and engineering; network management; and optical networking. In collaboration with several others at AT&T Labs, he is developing a unified toolkit to manage IP networks. In recent years, he has also worked with Aman Shaikh and others at AT&T on the monitoring and management of large IP routing infrastructures. Albert has had his hands in AT&T's operational IP networks since their birth, with a focus on research on analysis, algorithms, and tools for operational network management.

Event (no name)

Date and Time
Tuesday, June 14, 2005 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
TBD
Host
Virginia Hogan

End-to-End Estimation of the Available Bandwidth Variation Range

Date and Time
Tuesday, June 14, 2005 - 1:30pm to 3:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Constantine Dovrolis, from Georgia Tech
Host
Virginia Hogan
The available bandwidth (avail-bw) of a network path is an important performance metric, and its end-to-end estimation has recently received significant attention. Previous work focused on estimating the average avail-bw, ignoring the significant variability of this metric at different time scales.

In this paper, we show how to estimate a given percentile of the avail-bw distribution at a user-specified time scale. If two estimated percentiles cover the bulk of the distribution (say, 10% to 90%), the user can obtain a practical estimate of the avail-bw variation range. We present two estimation techniques. The first is iterative and non-parametric, which makes it more appropriate for very short time scales (typically less than 100 ms) or for bottlenecks with limited flow multiplexing, where the avail-bw distribution may be non-Gaussian. The second technique is parametric: it assumes that the avail-bw follows a Gaussian distribution, and it can produce an estimate faster because it is not iterative. The two techniques have been implemented in a measurement tool called Pathvar. Pathvar can track the avail-bw variation range within 10-20%, even under non-stationary conditions. Finally, we identify four factors that play a crucial role in the variation range of the avail-bw: traffic load, number of competing flows, rate of competing flows, and, of course, the measurement time scale.
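
For the parametric technique, the percentile computation itself is simple. The sketch below is a minimal illustration with made-up numbers (not Pathvar itself): it assumes the avail-bw samples at the chosen time scale are roughly Gaussian and reads the 10th and 90th percentiles off the fitted normal distribution.

```python
from statistics import NormalDist, mean, stdev

# Made-up avail-bw samples (Mbps), measured at some fixed time scale.
samples = [42.1, 55.3, 48.7, 51.0, 39.8, 46.5, 53.2, 44.9]

mu, sigma = mean(samples), stdev(samples)
fitted = NormalDist(mu, sigma)     # the parametric Gaussian assumption
lo = fitted.inv_cdf(0.10)          # 10th percentile
hi = fitted.inv_cdf(0.90)          # 90th percentile
print(f"estimated avail-bw variation range: {lo:.1f} to {hi:.1f} Mbps")
```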

Metric Geometry and Computer Science

Date and Time
Monday, May 2, 2005 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
James Lee, from UC Berkeley
Host
Moses Charikar
The combinatorial landscape is fraught with complexity. Computing optimal solutions to natural problems is often intractable, and understanding discrete systems can be difficult without the presence of richer structures to guide our intuition. When we are able to geometrize a system or to realize our combinatorial structures as part of a larger geometry, there is the potential to greatly increase our depth of understanding. This allows us to employ powerful tools and intuitions developed in fields like non-linear functional analysis, Riemannian geometry, and geometric group theory.

This talk focuses on two examples. The first involves low-distortion embeddings of finite metric spaces and their applications to algorithms and structural theorems for cuts, flows, and balanced separators in graphs. The second revolves around abstract notions of dimension and how they can be used to characterize the complexity of certain problems on metric spaces and networks. In each case, I will discuss how the techniques developed in solving computational problems have not only borrowed from known geometric tools but have also contributed back to those communities.

The talk is intended for a general CS audience.
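
To make the notion of a low-distortion embedding concrete, here is a small hypothetical example (not from the talk): embed the 4-cycle, carrying its shortest-path metric, into the real line and measure how badly distances are distorted.

```python
from itertools import combinations

def d_cycle(u, v, n=4):
    # shortest-path distance between vertices u and v on the n-cycle
    return min((u - v) % n, (v - u) % n)

# "Cut the cycle open" and lay its vertices out on the line.
f = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}

pairs = list(combinations(range(4), 2))
expansion   = max(abs(f[u] - f[v]) / d_cycle(u, v) for u, v in pairs)
contraction = max(d_cycle(u, v) / abs(f[u] - f[v]) for u, v in pairs)

# Distortion = (worst stretch) * (worst shrink). Here it is 3.0: vertices
# 0 and 3 are adjacent on the cycle but land 3 apart on the line.
print("distortion:", expansion * contraction)
```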

Competitive Collaborative Learning

Date and Time
Wednesday, April 27, 2005 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Baruch Awerbuch, from Johns Hopkins
Host
Jennifer Rexford
Intuitively, it is clear that trust or shared taste enables a community of users to be more productive, as it allows cooperative learning and avoids unnecessary repetition of mistakes. In this paper we develop algorithms for the users in such a community to make decisions about selecting products or resources, in a model characterized by two key features:

The quality of the products or resources may vary over time.

Some of the users in the system may be dishonest, manipulating their actions in a Byzantine manner to achieve other goals.

Such algorithms are a fundamental primitive required by applications such as reputation systems in e-commerce, personalized web search, locating resources in peer-to-peer networks, and spam filtering. Note that in all of these applications, it is essential to deal with resources whose quality changes over time (e.g., a seller on eBay may perform perfectly in some transactions but may delay or withhold delivery in others) and to deal with the presence of dishonest agents in the system (again using the eBay example, a seller may cooperate with a "shill" whose objective is to boost that seller's reputation).

Our problem can be viewed as a unification or generalization of a number of different research frameworks, including:

Online learning theory, specifically the multi-armed bandit problem

Collaborative filtering and recommendation

We provide algorithms that enable honest users with similar taste to dynamically adapt their choices, so that in the aggregate, their performance nearly matches that of the single best static action.

The main result in this paper is a new algorithm for trust and collaborative filtering, whose convergence time (defined, roughly, as the number of steps required to become constant-competitive) is polylogarithmic in the problem size. Previous methods suffered from a linear convergence time.
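
As background for the bandit setting mentioned above, here is a minimal sketch of the classic EXP3 multi-armed bandit algorithm (a standard single-learner baseline, not the paper's collaborative algorithm), which aims to compete with the best single fixed choice in hindsight even when arm qualities vary over time.

```python
import math
import random

def exp3(reward_fn, K=3, T=2000, gamma=0.1):
    """Classic EXP3: exponential weights over importance-weighted rewards."""
    weights = [1.0] * K
    total = 0.0
    for t in range(T):
        s = sum(weights)
        probs = [(1 - gamma) * w / s + gamma / K for w in weights]
        arm = random.choices(range(K), weights=probs)[0]
        x = reward_fn(arm, t)                       # reward in [0, 1]
        total += x
        weights[arm] *= math.exp(gamma * (x / probs[arm]) / K)
        m = max(weights)                            # renormalize for
        weights = [w / m for w in weights]          # numerical stability
    return total / T

# Toy environment: arm quality drifts over time, as in the model above.
def reward(arm, t):
    base = (0.3, 0.5, 0.7)[arm]
    drift = 0.2 * math.sin(t / 200.0 + arm)         # time-varying quality
    return min(1.0, max(0.0, base + drift + random.gauss(0.0, 0.05)))

print("average reward:", exp3(reward))
```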

Joint work with R. Kleinberg.

Improving the Reliability of Commodity Operating Systems

Date and Time
Tuesday, April 26, 2005 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Michael Swift, from Washington
Host
Kai Li
Despite decades of research in fault tolerance, commodity operating systems, such as Windows and Linux, continue to crash. In this talk, I will describe a new reliability subsystem for operating systems that prevents the most common cause of crashes, device driver failures, without requiring changes to drivers themselves. To date, the subsystem has been used in Linux to prevent system crashes in the presence of driver failures, recover failed drivers transparently to the OS and applications, and update drivers "on the fly" without requiring a system reboot after installation. Measurements show that the system is extremely effective at protecting the OS from driver failures, while imposing little runtime overhead.
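
As a loose, user-level analogy for the transparent-recovery idea (the actual subsystem isolates kernel drivers in C with hardware protection domains; this sketch only illustrates the pattern), a proxy can log a driver's configuration calls, and on a crash restart the driver, replay the log to restore its state, and retry, so callers never observe the failure.

```python
import random

class FlakyDriver:
    """Stand-in for a device driver that occasionally crashes."""
    def __init__(self):
        self.settings = {}
    def configure(self, key, value):
        self.settings[key] = value
    def read(self):
        if random.random() < 0.3:
            raise RuntimeError("driver crashed")
        return "data"

class RecoveringProxy:
    """Log configuration calls; on failure, restart the driver, replay
    the log to restore its state, and retry the failed operation."""
    def __init__(self, factory):
        self._factory, self._log = factory, []
        self._driver = factory()
    def configure(self, key, value):
        self._log.append((key, value))
        self._driver.configure(key, value)
    def read(self):
        while True:
            try:
                return self._driver.read()
            except RuntimeError:
                self._driver = self._factory()        # restart the driver
                for key, value in self._log:          # replay saved state
                    self._driver.configure(key, value)

proxy = RecoveringProxy(FlakyDriver)
proxy.configure("mode", "promiscuous")
print([proxy.read() for _ in range(5)])   # no crash reaches the caller
```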

Perceptual Data Mining

Date and Time
Thursday, April 21, 2005 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Chris Stauffer, from MIT CSAIL
Host
Szymon Rusinkiewicz
Imagine if computers possessed the ability of small children to observe and comprehend their dynamic environment. Very young children can track moving objects, differentiate them, identify them, understand their activity, and understand their interactions with static objects and with other moving objects in the world. There are amazing benefits in coupling these basic human abilities with the unique capabilities of computers: to communicate instantly with high bandwidth; to store and index massive quantities of observations with perfect recall; to process in parallel; and to draw inferences over extremely large stores of data. These benefits have driven increased interest in automated computer vision applications, such as intelligent visual surveillance, automated traffic analysis, quantitative observation of experimental animals, and wide-area scene understanding, all of which enable high-level computer vision research and are predicated on the ability to detect and localize certain types of objects.

This talk describes my work in Perceptual Data Mining (PDM), a bottom-up data-driven framework for bootstrapping visual intelligence in novel environments. This work is focused on developing computational analogs for basic human perception and exploiting the strengths of computers to take full advantage of these capabilities. This research has centered on the development of systems that are capable of: automatically tracking multiple objects in real-time across multiple overlapping and non-overlapping cameras in unstructured indoor and outdoor environments; automatically modeling the types of objects in a particular environment; automatically modeling the activities that these objects perform; learning patterns of the activities over extended periods of time; and detecting unusual objects or behavior. Even without supervision, this system can create a compact description of the objects and activities in an environment that enables effective query retrieval. With minimal supervision, this system can communicate and summarize the activity in an environment. More information: http://www.csail.mit.edu/~stauffer/
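
One low-level ingredient of tracking systems like these is separating moving objects from a quasi-static background. The sketch below is a simplified single-Gaussian-per-pixel background model run on synthetic data (real systems of this kind use richer per-pixel mixtures): pixels that deviate far from a running background estimate are flagged as foreground.

```python
import numpy as np

ALPHA = 0.05   # learning rate of the running background model
K = 2.5        # foreground threshold, in standard deviations

def update(mean, var, frame):
    fg = np.abs(frame - mean) > K * np.sqrt(var)   # foreground mask
    mean = (1 - ALPHA) * mean + ALPHA * frame      # adapt the background
    var = (1 - ALPHA) * var + ALPHA * (frame - mean) ** 2
    return mean, np.maximum(var, 1e-4), fg         # floor var for stability

# Toy demo: noisy static background with one bright pixel sweeping across.
rng = np.random.default_rng(0)
mean, var = np.full((8, 8), 0.5), np.full((8, 8), 0.01)
for t in range(20):
    frame = 0.5 + 0.02 * rng.standard_normal((8, 8))
    frame[3, t % 8] = 1.0                          # the "moving object"
    mean, var, fg = update(mean, var, frame)
print("foreground pixels detected in last frame:", int(fg.sum()))
```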
