04-28
Knowledge-lean Approaches to Natural Language Processing

To enable computers to understand and use natural language, a massive amount of knowledge, linguistic and otherwise, must be acquired. As a result, much recent research has focused on creating systems that automatically learn high-quality information about language, and about the world, from the statistics of unprocessed or minimally processed language samples alone. As examples, I will focus on two lines of work. The first uses information-theoretic distributional clustering methods trained on large language samples to induce probabilistic models of linguistic co-occurrences. The second uses multiple-sequence-alignment algorithms, commonly employed in computational biology, to learn how to generate English versions of computer-generated proofs, creating texts whose quality rivals that of hand-crafted systems.

Portions of this talk are based on joint work with Regina Barzilay and with Fernando Pereira.

Date and Time
Wednesday, April 28, 2004, 4:00pm - 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Speaker
Lillian Lee, Cornell University
Host
Robert Schapire

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers, or views presented.
