Talk

Datacenter Network Architecture @ Yahoo!: Past, present, and challenges for the future

Date and Time
Monday, May 2, 2011 - 1:00pm to 2:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Speaker
Igor Gashinsky, from Yahoo!
Host
Michael Freedman
As the number of people and applications on the Internet grows exponentially, so does the complexity of serving vast amounts of information. From unlimited e-mail storage, 10+ megapixel photos, and HD streaming video stored and shared online to semi-real-time page personalization and real-time search applications, storage and computing power requirements have grown by leaps and bounds over the past 15 years.

Ever wonder what the datacenter networks supporting these applications look like? This talk will provide a basic overview of large-scale datacenter network design principles, stepping through a history of how those designs evolved at Yahoo! over the years, the challenges that were faced, their solutions, as well as the challenges we anticipate in the (near) future.

Igor Gashinsky is a principal architect at Yahoo!, a global content provider, where he is involved in projects ranging from overall network design (including highly resilient switching and routing architecture, peering, MPLS, and L4-7 load balancing) to scalable content delivery methodologies and DNS architecture. Prior to his 8.5 years with Yahoo!, Igor worked as a Senior Systems & Network Engineer for HotJobs.com and consulted for a number of clients, working on network and systems architecture, network security, system clustering/HA, high-performance storage solutions, and general Unix system and network administration. Igor holds a BS in Computer Science from Stevens Institute of Technology.

Why JavaScript Matters

Date and Time
Tuesday, April 26, 2011 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Talk
Speaker
Douglas Crockford, from Yahoo! Inc.
JavaScript has been shunned by the academic community because of its many obvious shortcomings. But it includes the best features of Scheme and Self, making it a language deserving of study and research. It was the first lambda language to go mainstream. It is a language with amazing ubiquity and expressive power.

Douglas Crockford is a senior JavaScript Architect at Yahoo! He is well known for his work in introducing JavaScript Object Notation (JSON). He has also worked on the computerization of media at Atari, Lucasfilm, and Paramount. Crockford was the founder and CEO of Electric Communities from 1993 to 2001, and the founder and CTO of State Software (also known as Veil Networks) from 2001 to 2002. Crockford is the author of JavaScript: The Good Parts (ISBN 978-0596517748).

Determinantal Point Processes: Representation, Inference and Learning

Date and Time
Tuesday, March 22, 2011 - 12:00pm to 1:00pm
Location
Computer Science 402
Type
Talk
Determinantal point processes (DPPs) arise in random matrix theory and quantum physics as models of random variables with negative correlations. Among many remarkable properties, they offer tractable algorithms for exact inference, including computing marginals, computing certain conditional probabilities, and sampling. DPPs are a natural model for subset selection problems where diversity is preferred. For example, they can be used to select diverse sets of sentences to form document summaries, or to return relevant but varied text and image search results, or to detect non-overlapping multiple object trajectories in video. I'll present our recent work on a novel factorization and dual representation of DPPs that enables efficient inference for exponentially-sized structured sets. We develop a new inference algorithm based on Newton identities for DPPs conditioned on subset size. We also derive efficient parameter estimation for DPPs from several types of observations. I'll show the advantages of the model on several natural language and vision tasks: extractive document summarization, diversifying image search results and multi-person articulated pose estimation problems in images.
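
As a rough illustration of why DPPs favor diverse subsets, the sketch below uses the standard L-ensemble form, in which a subset Y has probability det(L_Y) / det(L + I). It is not taken from the talk: the feature vectors, the Gram-matrix kernel, and the helper names (`det`, `submatrix`, `dpp_prob`) are assumptions chosen for the example, and the brute-force determinant is only suitable for tiny matrices.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion; fine for the tiny matrices used here."""
    n = len(m)
    if n == 0:
        return 1.0  # determinant of the empty matrix, by convention
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def submatrix(L, idx):
    """Principal submatrix of L restricted to the rows and columns in idx."""
    return [[L[i][j] for j in idx] for i in idx]

def dpp_prob(L, subset):
    """P(Y = subset) = det(L_Y) / det(L + I) for an L-ensemble DPP."""
    n = len(L)
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(submatrix(L, subset)) / det(L_plus_I)

# Hypothetical item features: items 0 and 1 are near-duplicates, item 2 differs.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

# Kernel built as a Gram matrix of the features (an assumption for this sketch).
L = [[sum(a * b for a, b in zip(u, v)) for v in features] for u in features]

p_similar = dpp_prob(L, (0, 1))  # two near-duplicate items
p_diverse = dpp_prob(L, (0, 2))  # two dissimilar items

# Probabilities over all subsets (including the empty set) should sum to 1.
total = sum(dpp_prob(L, s)
            for k in range(len(L) + 1)
            for s in combinations(range(len(L)), k))

print(p_similar, p_diverse, total)
```

With these made-up features, the dissimilar pair is roughly a hundred times more likely than the near-duplicate pair, which is the negative-correlation behavior the abstract describes.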

Joint work with Alex Kulesza, University of Pennsylvania

Ben Taskar received his bachelor's and doctoral degrees in Computer Science from Stanford University. After a postdoc at the University of California at Berkeley, he joined the faculty at the University of Pennsylvania Computer and Information Science Department in 2007, where he currently co-directs PRiML: Penn Research in Machine Learning. His research interests include machine learning, natural language processing and computer vision. He has been awarded the Sloan Research Fellowship and selected for the Young Investigator Program by the Office of Naval Research and the DARPA Computer Science Study Group. His work on structured prediction has received best paper awards at NIPS and EMNLP conferences.

Learning and Mining in Complex Networks, with Applications to Cyber Situational Awareness

Date and Time
Tuesday, March 8, 2011 - 12:00pm to 1:00pm
Location
Computer Science 302
Type
Talk
Speaker
Tina Eliassi-Rad, from Rutgers University
Complex networks are ubiquitous in many domains. Examples include spatial, technological, informational, social, and biological networks. In this talk, I will present algorithms for both network classification and clustering, paying specific attention to evaluation in such non-IID settings, scalability, transfer learning, and applications to cyber situational awareness.

Tina Eliassi-Rad is an Assistant Professor at the Department of Computer Science at Rutgers University. She is also a member of the Rutgers Center for Computational Biomedicine, Imaging, and Modeling (CBIM) and Rutgers Center for Cognitive Science (RuCCS). Until September 2010, Tina was a Member of Technical Staff at Lawrence Livermore National Laboratory. Tina earned her Ph.D. in Computer Sciences (with a minor in Mathematical Statistics) at the University of Wisconsin-Madison in 2001. Broadly speaking, Tina's research interests include machine learning, data mining, and artificial intelligence. Her work has been applied to the World-Wide Web, text corpora, large-scale scientific simulation data, and complex networks. Tina is an action editor for the Data Mining and Knowledge Discovery Journal. She received a US DOE Office of Science Outstanding Mentor Award in 2010.

Let the Market Drive Deployment: A Strategy for Transitioning to BGP Security

Date and Time
Tuesday, March 29, 2011 - 12:30pm to 1:30pm
Location
Computer Science 402
Type
Talk
With a cryptographic root-of-trust for Internet routing (RPKI) on the horizon, we can finally start planning the deployment of one of the secure interdomain routing protocols proposed over a decade ago (Secure BGP, secure origin BGP). However, if experience with IPv6 is any indicator, this will be no easy task. Security concerns alone seem unlikely to provide sufficient local incentive to drive the deployment process forward. Worse yet, the security benefits provided by the S*BGP protocols do not even kick in until a large number of ASes have deployed them.

Instead, we appeal to ISPs' interest in increasing revenue-generating traffic. We propose a strategy that governments and industry groups can use to harness ISPs' local business objectives and drive global S*BGP deployment. We evaluate our deployment strategy using theoretical analysis and large-scale simulations on empirical data. Our results give evidence that the market dynamics created by our proposal can transition the majority of the Internet to S*BGP.

Joint work with Sharon Goldberg and Michael Schapira

Predicting Faults in Heterogeneous, Federated Distributed Systems

Date and Time
Friday, October 1, 2010 - 2:00pm to 3:00pm
Location
Computer Science 402
Type
Talk
Host
Jennifer Rexford
It is notoriously difficult to make distributed systems reliable. This becomes even harder for widely deployed systems that become heterogeneous and federated. The set of routers in charge of inter-domain routing in the Internet is a prime example of such a system. The unanticipated interaction of nodes under seemingly valid configuration changes and local fault-handling can have a profound effect. For example, the Internet has suffered from multiple IP prefix hijackings, as well as performance and reliability problems due to emergent behavior resulting from a local session reset.

We argue that the key step in making these systems reliable is the need to automatically predict faults. In this talk, I will describe the design and implementation of DiCE, a system that uses temporal and spatial awareness to predict faults in heterogeneous, federated systems. Our live evaluation in the testbed shows that DiCE quickly and successfully predicts two important classes of faults, operator mistakes and programming errors, that have plagued BGP routing in the Internet.

Joint work with Marco Canini, Vojin Jovanovic, and Gautam Kumar

Bio: Dejan Kostić obtained his Ph.D. in Computer Science at Duke University, under Amin Vahdat. He spent the last two years of his studies and a brief stay as a postdoctoral scholar at the University of California, San Diego. He received his Master of Science degree in Computer Science from the University of Texas at Dallas, and his Bachelor of Science degree in Computer Engineering and Information Technology from the University of Belgrade (ETF), Serbia. In January 2006, he started as a tenure-track assistant professor at the School of Computer and Communications Sciences at EPFL (Ecole Polytechnique Fédérale de Lausanne), Switzerland. In 2010, he received a European Research Council (ERC) Starting Investigator Award. His interests include Distributed Systems, Computer Networks, Operating Systems, and Mobile Computing.

Wide-Area Route Control for Distributed Services

Date and Time
Tuesday, June 15, 2010 - 11:00am to 11:45am
Location
Computer Science 402
Type
Talk
Speaker
Vytautas Valancius, from Georgia Tech
Host
Jennifer Rexford
Many distributed services would benefit from control over the flow of traffic to and from their users, to offer better performance and higher reliability at a reasonable cost. Unfortunately, although today's cloud-computing platforms offer elastic computing and bandwidth resources, they do not give services control over wide-area routing. We propose replacing the data center's border router with a Transit Portal (TP) that gives each service the illusion of direct connectivity to upstream ISPs, without requiring each service to deploy hardware, acquire IP address space, or negotiate contracts with ISPs. Our TP prototype supports many layer-two connectivity mechanisms, amortizes memory and message overhead over multiple services, and protects the rest of the Internet from misconfigured and malicious applications. Our implementation extends and synthesizes open-source software components such as the Linux kernel and the Quagga routing daemon. We also implement a management plane based on the GENI control framework and couple this with our four-site TP deployment and Amazon EC2 facilities.

Vytautas Valancius is a Ph.D. candidate at Georgia Institute of Technology, advised by Professor Nick Feamster. His research interests include interdomain routing, Internet economics, and network virtualization. Prior to his Ph.D. studies, Vytautas obtained an M.S. at KTH, Sweden, and worked in the networking industry as a consultant for 5 years, earning CCIE#14359 certification.

Alumni-Faculty Forum: New Directions in Social Media

Date and Time
Friday, May 28, 2010 - 2:30pm to 3:30pm
Location
McCosh Hall 50
Type
Talk
Moderator: Ed Felten, Director, Center for Information Technology, and Professor, Computer Science and Public Affairs

Panelists:
Bruce Campbell '90, President, Corporate Development and Digital Media, Discovery Communications
Abe Crystal '00, Co-Founder, MoreBetter Labs
Bradford Lyman '05, Manager, Client Services, Medallia LLC
Alexander Macgillivray '95, General Counsel, Twitter Inc.

Programming Parallel Accelerators at the \

Date and Time
Thursday, May 20, 2010 - 3:30pm to 4:30pm
Location
Computer Science 302
Type
Talk
Speaker
Ben Ylvisaker, from University of Washington
Parallel accelerators like FPGAs and GPUs have been shown to provide huge performance and energy efficiency advantages on a wide range of applications, from video coding to scientific simulation to wireless communication. However, accelerators are still harder to program than conventional processors, which is almost certainly limiting their use. I advocate for "C level" programming of accelerators, and in this talk I will describe several projects I have worked on to enable that. Specifically, I will discuss a new approach to pipelining complex loops, probabilistic auto-tuning, and relaxed I/O ordering operational semantics.

Ben Ylvisaker is almost finished being a graduate student at the University of Washington. There he is advised by Carl Ebeling and Scott Hauck, and works in the Mosaic group, which does research on architectures, tools and applications for parallel accelerators. Before coming to UW, he earned a master's degree at Carnegie Mellon, where he also got an undergrad degree a few years earlier. Interspersed between the academic adventures, Ben has worked at a number of startup companies, none of which exist anymore.

Data Aware Scheduling for Multi-threaded Applications on SMP Machines

Date and Time
Tuesday, February 16, 2010 - 12:30pm to 1:30pm
Location
Computer Science 402
Type
Talk
Host
Andrea LaPaugh
Extensive use of multi-threaded applications that run on SMP machines justifies modifications in thread scheduling algorithms to consider threads' characteristics in order to improve performance. Current schedulers (e.g. in Linux, AIX) avoid migrating tasks between CPUs unless absolutely necessary. Unwarranted data cache misses occur when tasks that share data run on different CPUs, or are far apart time-wise on the same CPU. This work presents an extension to the Linux scheduler that exploits inter-task data relations to reduce data cache misses in multi-threaded applications running on SMP platforms, thus improving runtime, memory throughput, and energy consumption. Our approach schedules the tasks to the CPU that holds the relevant data rather than to the one with highest affinity. We observed improvements in CPU time and throughput on several benchmarks. For the Chat benchmark the improvement in CPU time and cache misses is over 30% on average.
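
The placement idea in the abstract (run a task on the CPU that already holds its data, rather than on the CPU it last ran on) can be pictured with a toy model. The sketch below is not the Linux scheduler's interface and not the speaker's implementation: the function `pick_cpu`, the buffer names, and the set-based model of each CPU's cache contents are all hypothetical, chosen only to show the decision rule.

```python
def pick_cpu(task_data, cpu_cache, last_cpu):
    """Pick the CPU whose (modeled) cached data overlaps most with the task's data.

    task_data: set of data identifiers the task will touch (hypothetical model).
    cpu_cache: dict mapping CPU id -> set of data identifiers assumed to be cached.
    last_cpu:  CPU the task ran on most recently, used only to break ties.
    """
    overlap = {cpu: len(task_data & cached) for cpu, cached in cpu_cache.items()}
    best = max(overlap.values())
    candidates = sorted(cpu for cpu, n in overlap.items() if n == best)
    # On a tie, stay on the previous CPU if it is among the best choices.
    return last_cpu if last_cpu in candidates else candidates[0]

# Hypothetical state: CPU 0 holds the buffers this task shares with a sibling thread.
cpu_cache = {0: {"buf_a", "buf_b"}, 1: {"buf_c"}}

# The task last ran on CPU 1, but its data lives on CPU 0.
choice = pick_cpu({"buf_a", "buf_b"}, cpu_cache, last_cpu=1)

# With no cached data anywhere, the rule falls back to the previous CPU.
fallback = pick_cpu({"buf_x"}, cpu_cache, last_cpu=1)

print(choice, fallback)
```

A real scheduler would have to estimate data sharing between tasks rather than read it from a table, so this only captures the final placement decision.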

Dr. Pinter was a research specialist at MIT before receiving her Ph.D. in computer science from Boston University. Following her Ph.D., she joined the faculty of the Electrical Engineering Department of Technion (Israel), remaining for twelve years before joining the IBM research lab in Haifa as a member of the research staff. She recently left IBM to become CTO of Rascal Software Security. She is also an adjunct faculty member of the Department of Computer Science, Haifa University, supervising graduate student research.
