04-04
Distinguished Lecture Series: Hallucination in Text Summarization: From News to Narrative

The advent of large language models promises a new level of performance in generation of text of all kinds, enabling generation of text that is far more fluent, coherent and relevant than was previously possible. They also introduce a major new problem: they wholly hallucinate facts out of thin air. When summarizing an input document, they may incorrectly intermingle facts from the input, they may introduce facts that were not mentioned at all, and worse yet, they may even make up things that are not true in the real world. In this talk, I will discuss our work in characterizing the kinds of errors that can occur and methods that we have developed to help mitigate hallucination in language modeling approaches to text summarization. I will show how the level of hallucination varies when summarizing different genres and when performing different summarization tasks. While hallucination is less of a problem for summarizing a single news article, it is quite problematic when we move to summarization of short stories. When summarizing different perspectives present in multiple news articles on the same event, hallucination is also problematic.

Bio: Kathleen R. McKeown is the Henry and Gertrude Rothschild Professor of Computer Science at Columbia University and the Founding Director of the Data Science Institute, serving as Director from 2012 to 2017. In earlier years, she served as Department Chair (1998-2003) and as Vice Dean for Research for the School of Engineering and Applied Science (2010-2012). A leading scholar and researcher in the field of natural language processing, McKeown focuses her research on the use of data for societal problems; her interests include text summarization, question answering, natural language generation, social media analysis and multilingual applications. She has received numerous honors and awards, including the 2023 IEEE Innovation in Societal Infrastructure Award. She is an elected member of the American Philosophical Society and of the American Academy of Arts and Sciences, an American Association of Artificial Intelligence Fellow, a Founding Fellow of the Association for Computational Linguistics and an Association for Computing Machinery Fellow. Early in her career, she received the National Science Foundation Presidential Young Investigator Award and a National Science Foundation Faculty Award for Women. In 2010, she won both the Columbia Great Teacher Award, an honor bestowed by the students, and the Anita Borg Woman of Vision Award for Innovation.

Reception to follow the talk.

Sponsor

Event organized by AI Lab

Date and Time

Friday, April 4, 2025, 2:00pm - 3:30pm

Location

Robertson Hall 016

Event Type

Princeton Language and Intelligence

Speaker

Kathleen R. McKeown, from Columbia University

Host

PLI

Website

https://pli.princeton.edu/events/2025/distinguished-lecture-series-hallucination-text-summarization-news-narrative

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.

