FPO

Pranay Manocha FPO

Date and Time
Wednesday, May 1, 2024 - 1:30pm to 3:30pm
Location
Computer Science 401
Type
FPO

Pranay Manocha will present his FPO "Do we need a Reference Signal for Speech Quality Assessment?" on Wednesday, May 1, 2024 at 1:30 PM in CS 401.

Location: CS 401

The members of Pranay’s committee are as follows:
Examiners: Adam Finkelstein (Adviser), Szymon Rusinkiewicz, Paul Calamia (Meta Reality Labs Research)
Readers: Karthik Narasimhan, Zeyu Jin (Adobe Research)

A copy of his thesis is available upon request.  Please email  if you would like a copy of the thesis.
 
Everyone is invited to attend his talk.
 
Abstract follows below:
This thesis investigates new metrics for assessing speech quality that aim to align more closely with human auditory perception than current methods. It aims to improve the techniques and understanding of speech quality evaluation. It considers traditional methods that compare speech to a perfect (clean) reference and introduces new approaches for scenarios where such a reference is not available. It also emphasizes the significance of reference signals and explores the necessity for flexible evaluation techniques that can function effectively without an ideal reference. The dissertation describes three main categories of metrics: full-reference (FR), no-reference (NR), and non-matching reference (NMR), providing a detailed comparison of their benefits and limitations. Despite the general preference for FR metrics in situations where a corresponding clean reference signal is available, this research identifies specific circumstances where FR metrics may not be the most effective approach, thereby highlighting the utility and relevance of NMR metrics across different evaluative scenarios. Another contribution of this thesis is the introduction of CoRN, a novel metric formulated through the integration of FR and NR metrics. This metric builds on an exhaustive analysis of various evaluation metrics, demonstrating its utility in advancing audio quality assessment. Additionally, applying these methods to spatial audio in augmented and virtual reality settings expands the thesis’s contribution to the more general domain of audio quality assessment. This dissertation aims to refine and expand the methodologies and understanding of speech quality evaluation, a crucial step for the evolution of digital communication technologies.

Ted Sumers FPO

Date and Time
Wednesday, April 24, 2024 - 11:00am to 1:00pm
Location
Computer Science 301
Type
FPO

Ted Sumers will present his FPO "Grounding Communication in Real-World Action" on Wednesday, April 24, 2024 at 11:00 AM in CS 301.

Location: CS 301

The members of Ted’s committee are as follows:
Examiners: Tom Griffiths (Adviser), Ryan Adams, Adele Goldberg
Readers: Karthik Narasimhan, Dylan Hadfield-Menell (MIT), Tom Griffiths (Adviser)

A copy of his thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:
This dissertation bridges psychology and artificial intelligence (AI) to develop agents capable of learning through communication with humans. The first half establishes a foundation by comparing the efficacy of language and demonstration for transmitting complex concepts. Experiments reveal language’s superior ability to convey abstract rules, suggesting its importance for social learning. I then connect computational models of pragmatic language understanding to reinforcement learning settings, grounding a speaker’s utility in their listener’s decision problem. Behavioral evidence validates this as a model of human language use.

Building on these insights, the second half develops AI agents capable of learning from such language. I first extend the computational model to incorporate both commands and teaching. Experiments show this allows an AI listener to robustly infer the human’s latent reward function. I then introduce the problem of learning from fully natural language and contribute two novel approaches: utilizing aspect-based sentiment analysis and an inference network learned end-to-end. Behavioral evaluations demonstrate these models successfully learn from interactive human feedback.

Together, this dissertation provides a formal computational theory of the cognitive mechanisms supporting human social learning and embeds them in artificial agents. I discuss implications both for large language models and the continued development of AI agents that acquire and use information through genuine dialogue. This work suggests that building machines to learn as humans do – socially and linguistically – is a promising path towards beneficial artificial intelligence.

Fangyin Wei FPO

Date and Time
Monday, April 22, 2024 - 1:30pm to 3:30pm
Location
Computer Science 302
Type
FPO

Fangyin Wei will present her FPO "Learning to Edit 3D Objects and Scenes" on Monday, April 22, 2024 at 1:30 PM in CS 302.

Location: CS 302

The members of Fangyin’s committee are as follows:
Examiners: Szymon Rusinkiewicz (Adviser), Thomas Funkhouser (Adviser), Jia Deng
Readers: Felix Heide, Olga Russakovsky

A copy of her thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend her talk.

Abstract follows below:
3D editing plays a key role in many fields ranging from AR/VR, industrial and art design, to robotics. However, existing 3D editing tools either (i) demand labor-intensive manual efforts and struggle to scale to many examples, or (ii) use optimization and machine learning but produce unsatisfactory results (e.g., losing details, supporting only coarse editing, etc.). These shortcomings often arise from editing in geometric space rather than structure-aware semantic space, where the latter is the key to automatic 3D editing at scale. While learning a structure-aware space will result in significantly improved efficiency and accuracy, labeled datasets to train 3D editing models don’t exist. In this dissertation, we present novel approaches for learning to edit 3D objects and scenes in structure-aware semantic space with noisy or no supervision.

We first address how to extract the underlying structure to edit 3D objects, with a focus on editing two critical properties: semantic shape parts and articulations.

Our semantic editing method enables specific edits to an object’s semantic parameters (e.g., the pose of a person’s arm or the length of an airplane’s wing), leading to better preservation of input details and improved accuracy compared to previous work.

Next, we introduce a 3D annotation-free method that learns to model geometry, articulation, and appearance of articulated objects from color images. The model works on an entire category (as opposed to typical NeRF extensions that only overfit on a single scene) and enables various applications such as few-shot reconstruction and static object animation. It also generalizes to real-world captures.

Then, we tackle how to extract structure for scene editing. We present an automatic system that removes clutter (frequently moving objects such as clothes or chairs) from 3D scenes and inpaints the resulting holes with coherent geometry and texture. We address challenges including the lack of well-defined clutter annotations, entangled semantics and geometry, and multi-view inconsistency.

In summary, this dissertation demonstrates techniques to exploit the underlying structure of 3D data for editing. Our work opens up new research directions such as leveraging structures from other modalities (e.g., text, images) to empower 3D editing models with stronger semantic understanding.

Marcelo Orenes Vera FPO

Date and Time
Friday, May 3, 2024 - 11:00am to 1:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
FPO

details forthcoming

Uma Girish FPO

Date and Time
Thursday, May 2, 2024 - 11:00am to 1:00pm
Location
Computer Science Tea Room
Type
FPO

details forthcoming

Ben Burgess FPO

Date and Time
Tuesday, May 7, 2024 - 2:15pm to 4:15pm
Location
Not yet determined.
Type
FPO

details forthcoming

Mary Hogan FPO

Date and Time
Thursday, May 9, 2024 - 1:00pm to 3:00pm
Location
Friend Center 125
Type
FPO

details forthcoming

Kritkorn Karntikoon FPO

Date and Time
Thursday, May 9, 2024 - 10:30am to 12:30pm
Location
Computer Science 402
Type
FPO

details forthcoming

Kathy Chen FPO

Date and Time
Tuesday, April 9, 2024 - 10:00am to 12:00pm
Location
Carl Icahn Lab 280
Type
FPO

Kathy Chen will present her FPO "Decoding the sequence basis of gene regulation" on Tuesday, April 9, 2024 at 10:00 AM in Icahn 280 and Zoom.

Location: Zoom link: https://princeton.zoom.us/j/95565860844?pwd=LzNkcHZ6bnVackxZRnZJSitXdW9NUT09

The members of Kathy’s committee are as follows:
Examiners: Olga Troyanskaya (Adviser), Mona Singh, Kai Li
Readers: Ryan Adams, Jian Zhou (UT Southwestern)

Everyone is invited to attend her talk.  

Abstract follows below:

Deciphering the regulatory code of gene expression is a critical challenge in human genetics, instrumental to unlocking the potential of personalized medicine. Modern experimental technologies have resulted in an abundance of high-dimensional genome-wide data, revealing the complex system of epigenetic interactions encoded in the genome. The development of computational approaches that can leverage this vast data to model chromatin interactions globally offers a new understanding of how genomic sequences specify regulatory functions. Specifically, sequence-based deep learning models have become the de facto standard for learning the functional properties encoded in DNA sequences based on large sequencing datasets. These models are powerful tools for interpreting molecular and phenotypic effects, capable of predicting the impact of any noncoding variant in the human genome, even rare or never-before-observed variants, and systematically characterizing their consequences beyond what is tractable from experiments and quantitative genetics alone.

In this thesis, we present two deep learning-based sequence models, which predict different epigenetic properties of the genome that contribute to transcriptional regulation. First, Sei is a framework for integrating human genetics data with sequence information to discover the regulatory basis of traits and diseases. Sei learns a vocabulary of regulatory activities, called sequence classes, using a model that predicts 21,907 chromatin profiles across >1,300 cell lines and tissues. Sequence classes provide a global classification and quantification of sequences and variants based on diverse regulatory activities, such as cell type-specific enhancers.

Next, we developed a model, Hedgehog, which enables the quantification of variation on methylation sites. Hedgehog predicts 296 continuous-valued methylation profiles across a range of cell types and tissues. Hedgehog is complementary to Sei and reveals new insights into the relationship between DNA methylation and other epigenetic modifications.

Finally, we show how deep learning-based methods can be applied to elucidate the regulatory basis of human health and disease. Specifically, we use Sei to study the contribution of noncoding mutations in cancer. Collectively, we demonstrate novel frameworks for modeling the sequence dependencies of the epigenome and the capability of such approaches to delineate the regulatory mechanisms underlying complex diseases.

Angelina Wang FPO

Date and Time
Monday, May 6, 2024 - 2:30pm to 4:30pm
Location
Computer Science 402
Type
FPO

Angelina Wang will present her FPO "Operationalizing Responsible Machine Learning: From Equality Towards Equity" on Monday, May 6, 2024 at 2:30 PM in CS 402.

Location: CS 402

The members of Angelina’s committee are as follows:
Examiners: Olga Russakovsky (Adviser), Arvind Narayanan, Solon Barocas (Cornell)
Readers: Aleksandra Korolova, Janet Vertesi

A copy of her thesis is available upon request.  Please email gradinfo@cs.princeton.edu if you would like a copy of the thesis.

Everyone is invited to attend her talk.

Abstract follows below:

With the widespread proliferation of machine learning, there arises both the opportunity for societal benefit and the risk of harm. Approaching responsible machine learning is challenging because technical approaches may prioritize a mathematical definition of fairness that correlates poorly with real-world constructs of fairness due to too many layers of abstraction. Conversely, social approaches that engage with prescriptive theories may produce findings that are too abstract to effectively translate into practice. In my research, I bridge these approaches and utilize social implications to guide technical work. I will discuss three research directions that show how, despite the technically convenient approach of considering equality acontextually, a stronger engagement with societal context allows us to operationalize a more equitable formulation. First, I will introduce a dataset tool that we developed to analyze complex, socially-grounded forms of visual bias. Then, I will provide empirical evidence to support how we should incorporate societal context in bringing intersectionality into machine learning. Finally, I will discuss how in the excitement of using LLMs for tasks like human participant replacement, we have neglected to consider the importance of human positionality. Overall, I will explore how we can expand a narrow focus on equality in responsible machine learning to encompass a broader understanding of equity that substantively engages with societal context.