Searching and Browsing Visual Data
I’ll first introduce attribute representations that connect visual properties to human-describable terms. I’ll show how these attributes enable both fine-grained content-based retrieval and new forms of human supervision for recognition problems. Then I’ll overview our recent work on video summarization, where the goal is to automatically transform a long video into a short one. Using videos captured with egocentric wearable cameras, we’ll see how hours of data can be distilled into a succinct visual storyboard that is understandable in just moments. Together, these ideas are promising steps toward widening the channel of communication between humans and computer vision algorithms, which is critical for efficient browsing of large-scale image and video collections.
This is work done with Adriana Kovashka, Yong Jae Lee, Devi Parikh, Zheng Lu, Bo Xiong, and Dinesh Jayaraman.
Kristen Grauman is an Associate Professor in the Department of Computer Science at the University of Texas at Austin. Her research in computer vision and machine learning focuses on visual search and object recognition. Before joining UT-Austin in 2007, she received her Ph.D. from the EECS department at MIT, in the Computer Science and Artificial Intelligence Laboratory. She is an Alfred P. Sloan Research Fellow and a Microsoft Research New Faculty Fellow, and a recipient of the NSF CAREER award, the ONR Young Investigator award, the 2012 Regents' Outstanding Teaching Award from the University of Texas System, the 2013 PAMI Young Researcher Award, the 2013 Computers and Thought Award from the International Joint Conference on Artificial Intelligence, and a 2013 Presidential Early Career Award for Scientists and Engineers (PECASE). She and her collaborators were recognized with the CVPR Best Student Paper Award in 2008 for their work on hashing algorithms for large-scale image retrieval, and the Marr Best Paper Prize at ICCV in 2011 for their work on modeling relative visual attributes.