Zexuan Zhong FPO

Date and Time
Thursday, November 21, 2024 - 4:00pm to 6:00pm
Location
Engineering Quadrangle E423 (off campus)
Type
FPO

Zexuan Zhong will present his FPO "Language Models with Retrieval Augmentation and Conditional Computation" on Thursday, November 21, 2024 at 4:00 PM in E-Quad E423.

Location: E-Quad E423 & Zoom: https://princeton.zoom.us/j/3159486709

The members of Zexuan’s committee are as follows:
Examiners: Danqi Chen (Adviser), Sanjeev Arora, Karthik Narasimhan, Luke Zettlemoyer (University of Washington)
Readers: Kai Li, Sanjeev Arora

A copy of his thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:

Recent advancements in large language models (LMs) have primarily focused on scaling up model parameters and training tokens to improve performance across various tasks. However, this scaling increases computational costs significantly. Furthermore, conventional parametric LMs are fundamentally incapable of adapting to unseen domains, editing learned knowledge, and retaining long-tail knowledge, and they have been shown to easily leak private data from the training corpus. This thesis explores alternative approaches to scaling LMs while addressing these limitations.

Firstly, we study LMs with retrieval augmentation, where LMs leverage an external datastore to make predictions. We develop Trime, a novel end-to-end training approach that learns LMs and the retrieval models jointly. Our results show that Trime significantly enhances LM performance with the same model size and computational budget. These end-to-end trained retrieval-augmented LMs also provide users with effective adaptability to domains unseen during training.
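As background for the retrieval-augmented setting, one common way to combine a parametric LM with an external datastore is to interpolate the two next-token distributions (this is the kNN-LM style of augmentation, stated here as a generic sketch and not necessarily Trime's exact formulation; the function name is ours):

```python
def interpolate_retrieval_lm(p_lm, p_retrieval, lam=0.25):
    """Mix the parametric LM's next-token distribution with one induced
    by retrieved datastore neighbors. lam controls how much weight the
    retrieval distribution gets. Tokens absent from p_lm are ignored
    for brevity in this sketch."""
    return {tok: lam * p_retrieval.get(tok, 0.0) + (1 - lam) * p
            for tok, p in p_lm.items()}
```

End-to-end training, as in Trime, would additionally backpropagate through the retrieval scores rather than treating the retriever as fixed.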

Secondly, we focus on a fundamental challenge of LMs: editing facts that are stored in their parameters—a critical problem to address since the world is constantly changing and the knowledge in LMs becomes outdated easily. We examine state-of-the-art knowledge editing methods on LMs and find that existing evaluation paradigms are extremely limited. We propose a novel benchmark MQuAKE, consisting of multi-hop questions that assess whether edited models correctly answer questions where the answer should change as an entailed consequence of edited facts. We demonstrate that existing knowledge editing methods fail on the constructed multi-hop questions. We also propose a simple retrieval-augmented approach that stores all edited facts externally, which outperforms previous methods by a large margin.
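The external-memory idea can be illustrated with a toy multi-hop lookup over (entity, relation) facts, where edited facts are consulted before the model's possibly stale knowledge (a sketch only; all names here are ours, not the proposed method's API):

```python
def answer_multi_hop(start_entity, relations, edited_facts, base_kb):
    """Answer a multi-hop question by chaining single-hop lookups.
    At each hop, consult the external store of edited facts first,
    falling back to the (possibly outdated) base knowledge. This is
    why an edit to one fact correctly propagates to entailed answers."""
    entity = start_entity
    for relation in relations:
        key = (entity, relation)
        entity = edited_facts[key] if key in edited_facts else base_kb[key]
    return entity
```

For example, editing a single fact changes the answer to any question whose reasoning chain passes through it.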

Thirdly, we investigate training LMs with conditional computation, which is designed for scaling LMs without a substantial increase in computational costs. We focus on mixture-of-experts (MoE), a widely used conditional computation technique which facilitates efficient scaling. However, training the routing network in MoE introduces the challenge of optimizing a non-differentiable, discrete objective. We present a fully differentiable MoE architecture for autoregressive language model pre-training, Lory, which is based on two key techniques: (1) a causal segment routing strategy for high efficiency of expert merging and (2) a similarity-based data batching method for better expert specialization. Despite segment-level routing, Lory models achieve competitive performance compared to state-of-the-art MoE models with token-level routing. We further demonstrate that the trained experts of Lory capture domain-level specialization without supervision.
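The expert-merging idea can be illustrated with scalar linear experts (a toy sketch of fully differentiable routing, not Lory's actual architecture; experts here are single numbers for simplicity):

```python
def merged_expert_forward(x, expert_weights, router_probs):
    """Instead of discretely selecting one expert, use the routing
    probabilities to merge expert *parameters* into a single expert,
    then apply it. Every operation is smooth, so the router can be
    trained by ordinary backpropagation."""
    merged_w = sum(p * w for p, w in zip(router_probs, expert_weights))
    return merged_w * x
```

Merging parameters once per segment, rather than per token, is what keeps this cheap enough for pre-training.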

Overall, our research illuminates a new paradigm in LM scalability, fundamentally addressing critical limitations and advancing the development of more efficient, effective, adaptable, and updatable language models.

 

Lahav Lipson will present his FPO "Fast and Robust 3D Reconstruction" on Thursday, November 21, 2024 at 10:00 AM in Friend 202.

Date and Time
Thursday, November 21, 2024 - 10:00am to 12:00pm
Location
Friend Center 202
Type
FPO


Location: Friend 202

The members of Lahav’s committee are as follows:
Examiners: Jia Deng (Adviser), Adam Finkelstein, Szymon Rusinkiewicz
Readers: Anirudha Majumdar, Olga Russakovsky

A copy of his thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:

3D reconstruction from visual data is an important subtask for robotics, autonomous machines, and 3D scene understanding. It involves estimating camera and object motion from images and video, as well as 3D structure. I will introduce an approach known as Optimization Guided Neural Iterations (OGNI), and demonstrate how it can be applied to various 3D reconstruction tasks. In OGNI-based approaches, we mimic classical optimization algorithms by breaking down each task into a series of small revisions predicted by a shallow network. Each revision is supervised independently, and is informed by features conditioned on a running estimate of the solution. This mechanism is surprisingly general, and leads to robust and efficient solutions to 3D reconstruction. Moreover, I introduce several explicit optimization layers which enable one to reformulate these challenging problems into easier, low-level vision tasks. On visual SLAM, stereo matching, and object pose estimation, I show how this approach leads to state-of-the-art accuracy and/or speed. I also discuss potential future directions for this line of work.

Matthew Weaver will present his FPO "Bicubical Directed Type Theory" on October 31, 2024 at 11am in CS 301.

Date and Time
Thursday, October 31, 2024 - 11:00am to 1:00pm
Location
Not yet determined.
Type
FPO


 

The members of his committee are as follows:
Andrew Appel (Advisor)
Daniel Licata (Advisor/Wesleyan University)
Zachary Kincaid (Reader)
David Walker (Examiner)
Aarti Gupta (Examiner)

Title: Bicubical Directed Type Theory

Directed type theory is an analogue of homotopy type theory where types represent categories, generalizing groupoids. A classical, bisimplicial approach to directed type theory, developed by Riehl and Shulman, is based on equipping each type with both a notion of path and a separate notion of directed morphism. In this setting, a directed analogue of the univalence axiom asserts that there is a universe of covariant discrete fibrations whose directed morphisms correspond to functions, a higher-categorical analogue of the category of sets and functions. While this approach suffices for formalizing mathematics, for applications to computer science we would like a computational interpretation of directed type theory. In this thesis we give a constructive model of a directed type theory with directed univalence in bicubical sets. We formalize much of this in Agda following the approach of Orton and Pitts. First, building on the cubical techniques used to give computational models of homotopy type theory, we show that there is a universe of covariant discrete fibrations with a partial directed univalence principle asserting that functions are a retract of morphisms in this universe. To complete this retraction into an equivalence, we refine the model using Coquand, Ruch, and Sattler's work on constructive sheaf models. We introduce the cobar modality and, by restricting to fibrant types that are also cobar modal, we are able to complete our construction of directed univalence.

We then describe a generalization of the fibrant Riehl-Shulman extension types. We prove this in the setting of an arbitrary presheaf category with respect to a new notion of fibrancy that is given by a generic filling problem. This abstraction is general enough to capture all of the current presheaf models of type theories and their classifications of types specified by filling problems. In addition, this result extends the potential syntax of these type theories to be able to internally express any of these filling problems as fibrant types. We use this to then define a type theory in which the user can internally define classifying universes for any such filling problem. Lastly, we overview our implementation of bicubical directed type theory.

Ziyang Xu will present his FPO "A Fast and Extensible Memory Profiling Framework"

Date and Time
Friday, September 27, 2024 - 10:00am to 12:00pm
Location
Not yet determined.
Type
FPO


Details to follow

Kun Woo Cho will present her FPO "Programmable Smart Radio Environments: From Theory to Hardware Implementation" on  October 1, 2024 at 3pm in Friend Center 004 

Date and Time
Tuesday, October 1, 2024 - 3:00pm to 5:00pm
Location
Not yet determined.
Type
FPO


Examiners: Prof. Kyle Jamieson (Advisor, CS), Prof. Yasaman Ghasempour (ECE), Prof. Andrew Appel (CS)

Readers: Prof. Wyatt Lloyd (CS), Prof. Omid Abari (UCLA), Dr. Ranveer Chandra (MSR)

 

All are welcome to attend.

 

Title: 

Programmable Smart Radio Environments: From Theory to Hardware Implementation

 

Abstract: 

Today’s wireless networks are undergoing a rapid transformation, scaling in traffic volume, spectral efficiency, and radio count as never seen before. This thesis addresses the critical challenges emerging from this evolution in next-generation (NextG) wireless networks, focusing on realizing three key services: enhanced mobile broadband (Gbps or above), ultra-reliable low-latency communication (on the order of milliseconds), and massive machine-type communication (up to one million devices per square kilometer).

To meet these diverse and demanding requirements, this thesis poses a central question: Can we build a smarter radio environment controlled and learned by software, configuring itself in real time to meet different application needs? Current approaches to handling uncontrolled wireless signals are end-to-end. Unfortunately, sending and receiving endpoints are limited in their ability to shape this inherent propagation behavior. For instance, although multiple antennas can shape the beam pattern departing the sender, they cannot control how the resulting signals arrive at the receiver after traversing environmental obstacles. By focusing on changing the environment itself rather than the communication endpoints, this thesis offers a significant shift in design paradigms for modern wireless networks.

First, millimeter-wave (mmWave) technology offers multi-Gbps data rates due to its wide spectral bands. However, the high frequency of mmWave signals makes them vulnerable to blockage by walls, people, and obstacles, significantly limiting their practical applications. This thesis introduces mmWall, a programmable smart surface deployed on buildings to fully control the mmWave radio environment. Comprising over 4,000 sub-millimeter meta-materials, mmWall can steer signals through the surface or reflect them, bringing outdoor mmWave signals indoors and bypassing obstacles. This thesis also proposes Wall-Street, a smart surface installed on vehicles to provide seamless and reliable mmWave connectivity in high-mobility scenarios.

Second, satellite networks use constellations of thousands of satellites to provide low-latency communication and global coverage. However, these networks face two significant challenges: outages due to transient blockage, and complications in beam alignment between users and satellites due to the use of different frequency sub-bands in the uplink and downlink directions. Extending the concept of a programmable smart radio to satellite communications, this thesis introduces Wall-E, a dual-band smart surface that mitigates signal blockage and enhances the reliability of satellite-to-ground links, and Monolith, a smart surface that boosts inter-satellite link capacity.

Third, while these smart surfaces address challenges at the physical layer, we also tackle issues at the link/MAC layer, particularly in massive Internet of Things (IoT) networks. This thesis introduces Cross-Link Channel Prediction (CLCP), a machine learning technique that learns the radio environment and accordingly allocates networking resources to a large number of IoT devices. This AI-driven approach complements my programmable surfaces, creating a comprehensive smart radio solution for NextG networks.

Samuel Ginzburg will present his FPO "VectorVisor: A Binary Translation Scheme for Throughput-Oriented GPU Acceleration" on Thursday, August 29, 2024 in CS 302 at 1pm.

Date and Time
Thursday, August 29, 2024 - 1:00pm to 3:00pm
Location
Not yet determined.
Type
FPO


 

The members of his FPO committee are as follows:

Examiners: Michael Freedman (Advisor), Wyatt Lloyd, and Amit Levy

Readers: Mae Milano and Mohammad Shahrad (UBC)

 

All are welcome to attend.  Please see abstract below.

 

Beyond conventional graphics applications, general-purpose GPU acceleration has had significant impact on machine learning and scientific computing workloads. Yet, it has failed to see widespread use for server-side applications, which we argue is because GPU programming models offer a level of abstraction that is either too low-level (e.g., OpenCL, CUDA) or too high-level (e.g., TensorFlow, Halide), depending on the language. Not all applications fit into either category, resulting in lost opportunities for GPU acceleration.

 

We introduce VectorVisor, a vectorized binary translator that enables new opportunities for GPU acceleration by introducing a novel programming model for GPUs. With VectorVisor, many copies of the same server-side application are run concurrently on the GPU, where VectorVisor mimics the abstractions provided by CPU threads. To achieve this goal, we demonstrate how to (i) provide cross-platform support for system calls and recursion using continuations and (ii) make full use of the excess register file capacity and high memory bandwidth of GPUs. We then demonstrate that our binary translator is able to transparently accelerate certain classes of compute-bound workloads, gaining significant improvements in throughput-per-dollar of up to 2.9× compared to Intel x86-64 VMs in the cloud, and in some cases matching the throughput-per-dollar of native CUDA baselines.

Yue Tan will present her FPO "FAASTEN: An Architecture and implementation for securing cloud applications" on September 5, 2024 at 11am in CS 302.

Date and Time
Thursday, September 5, 2024 - 11:00am to 1:00pm
Location
Not yet determined.
Type
FPO

 


Zoom Link:  https://princeton.zoom.us/my/yuetan 


The members of her committee are as follows:
Amit Levy (advisor), Maria Apostolaki, Wyatt Lloyd
Readers: Mae Milano and Marcus Peinado (Microsoft Research)

All are welcome to attend. The FPO abstract follows below:

Modern web applications have evolved into intricate networks of micro-applications, posing challenges in ensuring the confidentiality and integrity of user data. In the current discretionary security paradigm, developers must enforce policies for every micro-application and all data paths that can arise at run-time. A promising alternative paradigm, decentralized information flow control (DIFC), suggests that the underlying system should enforce policies by tracking information flows. As the cloud has become a standard operating system for developing web applications, this thesis proposes the first DIFC design for the cloud.

We present Faasten, an architecture and implementation for securing cloud applications. Faasten includes a DIFC model and a FaaS-inspired system interface that implements the DIFC model. We provide a detailed interface security analysis and showcase how three representative cloud applications benefit from Faasten. We also show that Faasten introduces negligible latencies in controlling information flows and policy storage costs.
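As background, the core of any DIFC model is a label check applied to each information flow; a minimal secrecy-only sketch (an illustration of the general DIFC idea, not Faasten's actual label model or interface):

```python
def can_flow(src_tags, dst_tags):
    """Allow information to flow from a source to a destination only
    if the destination's secrecy label is at least as restrictive,
    i.e. it carries every secrecy tag of the source. The system, not
    the developer, applies this check on every data path."""
    return set(src_tags) <= set(dst_tags)
```

Real DIFC systems also track integrity labels and support declassification, which this sketch omits.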

Yufei Zheng will present her FPO "Compact Algorithms for Measuring Network Performance" on August 22, 2024 at 12pm in CS 301.

Date and Time
Thursday, August 22, 2024 - 12:00pm to 2:00pm
Location
Computer Science CS/Friend Quad
Type
FPO


 

The members of Yufei's committee are as follows: 

Examiners: Jennifer Rexford (adviser), Maria Apostolaki, and Mark Braverman

Readers: Huacheng Yu and David Hay (The Hebrew University of Jerusalem)

 

All are welcome to attend.  Please see abstract below.

 

Abstract:

For network administrators, performance monitoring provides valuable insights into network quality and security. The emergence of programmable network devices makes it possible to measure performance metrics in the data plane, where packets fly by. Running measurement tasks directly in the data plane improves efficiency, preserves user privacy, and enables the possibility of real-time actions based on the analysis. However, programmable data planes have limited memory size and memory accesses. Meanwhile, measuring performance metrics fundamentally requires a large amount of memory resources, as each packet must be processed relative to its predecessor in the same flow.

 

This dissertation tackles the specific challenges that arise from running performance measurements with limited memory. The first part of this dissertation introduces new data-plane algorithms for monitoring two canonical performance metrics related to Transmission Control Protocol (TCP): delay and TCP packet reordering.

• In measuring delay distribution, existing algorithms often exhibit bias against larger delays. We present fridges, a novel data structure that allows for the correction of the survivorship bias due to hash collisions. The main idea is to keep track of the probability p that a delay sample is collected, and count it as 1/p samples. Using multiple fridges together, we can further improve the accuracy by computing a weighted average of single-fridge estimators.

• We present efficient algorithms for identifying IP prefixes with heavy packet reordering. First, we sample as many flows as possible, regardless of their sizes, but only for a short period at a time. Next, we separately monitor the large flows over long periods, in addition to the flow sampling. Both algorithms measure at the flow level, and aggregate statistics at the prefix level.
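The inverse-probability correction in the first bullet can be sketched in a few lines (a toy illustration of the weighting idea only, not the full fridge data structure; the function name is ours):

```python
from collections import defaultdict

def corrected_delay_distribution(samples):
    """Estimate a delay distribution from biased samples. Each sample
    arrives as a (delay, p) pair, where p is the probability that this
    sample survived collection; counting it as 1/p samples undoes the
    survivorship bias, after which the histogram is normalized."""
    hist = defaultdict(float)
    for delay, p in samples:
        hist[delay] += 1.0 / p
    total = sum(hist.values())
    return {delay: weight / total for delay, weight in hist.items()}
```

Large delays, which are collected less often, thus receive proportionally larger weights in the final estimate.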

 

Existing counter-based algorithms for identifying heavy-hitter flows could also be modified to measure performance metrics. However, many such algorithms, despite showing good empirical performance, lack theoretical guarantees. In the second part, we present the first formal analysis of the performance of one such algorithm, the Random Admission Policy (RAP).

• We show that, for a highly skewed packet stream with k heavy flows, RAP with memory size k stores O(k) heavy flows with constant probability.

Kaifeng Lyu FPO

Date and Time
Friday, August 9, 2024 - 10:30am to 12:30pm
Location
Computer Science 402
Type
FPO

Kaifeng Lyu will present his FPO "Implicit Bias in Deep Learning Optimization: A Mathematical Examination" on Friday, August 9, 2024 at 10:30 AM in CS 402 and Zoom.

Location: Zoom link: https://princeton.zoom.us/j/99448789401

The members of Kaifeng’s committee are as follows:
Examiners: Sanjeev Arora (Adviser), Elad Hazan, Chi Jin
Readers: Danqi Chen, Jason Lee

A copy of his thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:
Deep learning has achieved remarkable success in recent years, yet training neural networks often involves a delicate combination of guesswork and hyperparameter tuning. A critical aspect of this process is the “implicit bias” of optimization methods, where minor changes in the optimization setup—without affecting the small training loss at convergence—can drastically shift the solution to which the model converges, thereby affecting test performance. This dissertation presents a collection of results that mathematically characterize this implicit bias in various training regimes.

 

The first part of this dissertation explores how gradient descent, even without explicit regularization, can converge to solutions that maximize the margin. Previous results have established the first-order optimality of margin for homogeneous neural networks in general, but the global optimality of margin is not guaranteed due to their non-convex nature. This dissertation provides in-depth theoretical analyses when data has simple structures: for linearly separable data, we present both positive and negative results on whether the global optimality of margin can be attained. Furthermore, we show how this margin-based view can be used to explain interesting generalization phenomena in training neural networks with or without explicit regularization, including the simplicity bias and grokking phenomena.
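As background, the normalized margin of an L-homogeneous network f(θ; x) on labeled examples (x_i, y_i), the quantity these analyses characterize, is standardly defined as:

```latex
\gamma(\theta) \;=\; \min_{i}\, \frac{y_i \, f(\theta;\, x_i)}{\|\theta\|^{L}},
\qquad \text{where } f(c\,\theta;\, x) = c^{L} f(\theta;\, x) \ \text{for all } c > 0.
```

Homogeneity makes γ invariant to rescaling θ, so "maximizing the margin" is a statement about the direction in which gradient descent drives the parameters.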

The second part of the dissertation presents two results that capture the implicit biases induced by finite learning rate. Many existing analyses, including the margin-based ones in the first part, describe implicit biases that hold even when the learning rate is infinitesimal. However, practical implementations use finite learning rates, which have been empirically observed to benefit generalization. We analyze how full-batch GD with finite learning rates, combined with key training components like normalization layers and weight decay, create a bias towards flatter minima, which are positively correlated with better generalization. Additionally, we study the implicit bias in stochastic optimization and derive rigorous approximations for the dynamics of adaptive gradient methods like Adam and RMSprop via Stochastic Differential Equations (SDEs) to capture the effect of finite learning rates. Based on this, we also derive the square root scaling rule as a practical guideline for adjusting the optimization hyperparameters of adaptive gradient methods when changing batch size.
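The square-root scaling rule mentioned above can be stated compactly as a helper (the function name is ours; the rule, as described in the abstract, applies to adaptive gradient methods such as Adam and RMSprop when only the batch size changes):

```python
import math

def scale_lr_for_batch_size(lr, batch_size, new_batch_size):
    """Square-root scaling rule: when the batch size grows by a
    factor kappa, scale the learning rate by sqrt(kappa) to keep
    the stochastic dynamics approximately unchanged."""
    kappa = new_batch_size / batch_size
    return lr * math.sqrt(kappa)
```

For example, quadrupling the batch size doubles the learning rate under this rule.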

Dingli Yu FPO

Date and Time
Thursday, July 18, 2024 - 1:30pm to 3:30pm
Location
Computer Science 402
Type
FPO

Dingli Yu will present his FPO "Efficient Scaling of Large Models: Principles in Optimization and Data Aspects" on Thursday, July 18, 2024 at 1:30 PM in CS 402 and Zoom.

Location: Zoom link: https://princeton.zoom.us/j/7188314894?omn=99115109336

The members of Dingli’s committee are as follows:
Examiners: Sanjeev Arora (Adviser), Elad Hazan, Chi Jin
Readers: Danqi Chen, Mark Braverman

A copy of his thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:
Deep learning has advanced remarkably in recent decades. Yet, its theoretical foundations, particularly in the realm of large models, still lag behind. This thesis focuses on research that combines strong theoretical foundations with practical applications in efficiently scaling up large models.

In the first part of the thesis, we focus on the training dynamics of neural nets by covering the theory of overparametrized neural nets. We will briefly introduce the theory of Neural Tangent Kernel (NTK), and proceed with Hyperparameter Transfer, an important application of the Tensor Program framework. We cover some of the earliest papers that establish NTK as a research field, along with the limitations of NTK. Hyperparameter Transfer is a novel and efficient paradigm for hyperparameter tuning by providing the optimal strategy for scaling up models. We introduce the characterization of the training dynamics for deep neural nets and offer an efficient hyperparameter selection scheme where optimal hyperparameters selected by tuning on shallow nets also work for deep nets.

In the second part of the thesis, we focus on the data aspect of large model scaling. We first introduce Skill-Mix, a novel evaluation that sidesteps issues of traditional large language model (LLM) evaluations, such as data contamination and cramming for leaderboards. Skill-Mix randomly selects k language skills, then prompts the LLM to produce a concise text that demonstrates the chosen skills. The exponentially growing number of skill combinations provably prevents data contamination and can further reveal the novelty of successful answers by powerful LLMs. We then introduce ConceptMix, an extension of Skill-Mix that evaluates the capability of text-to-image models to combine k randomly selected visual concepts. Finally, we uncover the capability of LLMs to learn and generalize skill compositions given good responses from Skill-Mix. The results show that a few thousand such examples are enough to significantly improve model performance on unseen skill combinations, beating much larger models. This suggests that incorporating skill-rich synthetic text into training is an efficient way to scale up the data.
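The combinatorial core of the evaluation is easy to see in code; a hypothetical sketch (all names here are ours, not the benchmark's API) of the sampling step and the count of possible combinations:

```python
import math
import random

def sample_skill_mix(skills, k, seed=0):
    """Draw k distinct skills uniformly at random, as in a
    Skill-Mix-style prompt. Sorting makes the output deterministic
    for a given seed."""
    rng = random.Random(seed)
    return sorted(rng.sample(skills, k))

def num_skill_combinations(n, k):
    """Number of distinct k-skill subsets from n skills, C(n, k).
    This count grows so quickly with k that memorizing all
    combinations during training is infeasible."""
    return math.comb(n, k)
```

With even 100 skills and k = 5, there are over 75 million combinations, which is the contamination-resistance argument in miniature.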

 
