Pranay Manocha FPO | Computer Science Department at Princeton University

Date and Time

Wednesday, May 1, 2024 - 1:30pm to 3:30pm

Location

Computer Science 401

Type

FPO

Pranay Manocha will present his FPO "Do we need a Reference Signal for Speech Quality Assessment?" on Wednesday, May 1, 2024 at 1:30 PM in CS 401.

The members of Pranay’s committee are as follows:
Examiners: Adam Finkelstein (Adviser), Szymon Rusinkiewicz, Paul Calamia (Meta Reality Labs Research)
Readers: Karthik Narasimhan, Zeyu Jin (Adobe Research)

A copy of his thesis is available upon request. Please email gradinfo (@cs.princeton.edu) if you would like a copy of the thesis.

Everyone is invited to attend his talk.

Abstract follows below:
This thesis investigates new metrics for assessing speech quality that aim to align more closely with human auditory perception than current methods. It considers traditional methods that compare speech to a perfect (clean) reference and introduces new approaches for scenarios where such a reference is not available. It also emphasizes the significance of reference signals and explores the need for flexible evaluation techniques that can function effectively without an ideal reference. The dissertation describes three main categories of metrics: full-reference (FR), no-reference (NR), and non-matching reference (NMR), providing a detailed comparison of their benefits and limitations. Despite the general preference for FR metrics in situations where a corresponding clean reference signal is available, this research identifies specific circumstances where FR metrics may not be the most effective approach, thereby highlighting the utility and relevance of NMR metrics across different evaluative scenarios. Another contribution of this thesis is the introduction of CoRN, a novel metric formulated through the integration of FR and NR metrics. This metric builds on an exhaustive analysis of various evaluation metrics, demonstrating its utility in advancing audio quality assessment. Additionally, applying these methods to spatial audio in augmented and virtual reality settings expands the thesis’s contribution to the more general domain of audio quality assessment. This dissertation aims to refine and expand the methodologies and understanding of speech quality evaluation, a crucial step for the evolution of digital communication technologies.