04-03
Efficient Methods and Hardware for Deep Learning

Deep learning has spawned a wide range of AI applications that are changing our lives. However, deep neural networks are both computationally and memory intensive, thus they are power hungry when deployed on embedded systems and data centers with a limited power budget. To address this problem, I will present an algorithm and hardware co-design methodology for improving the efficiency of deep learning.

I will first introduce "Deep Compression", which can compress deep neural network models by 10–49× without loss of prediction accuracy for a broad range of CNN, RNN, and LSTMs. The compression reduces both computation and storage. Next, by changing the hardware architecture and efficiently implementing Deep Compression, I will introduce EIE, the Efficient Inference Engine, which can perform decompression and inference simultaneously, saving a significant amount of memory bandwidth. By taking advantage of the compressed model and being able to deal with an irregular computation pattern efficiently, EIE achieves 13× speedup and 3000× better energy efficiency over GPU. Finally, I will revisit the inefficiencies in current learning algorithms, present DSD training, and discuss the challenges and future work in efficient methods and hardware for deep learning.

Song Han is a Ph.D. candidate supervised by Prof. Bill Dally at Stanford University. His research focuses on energy-efficient deep learning, at the intersection between machine learning and computer architecture. He proposed the Deep Compression algorithm, which can compress neural networks by 10–49× while fully preserving prediction accuracy. He designed the first hardware accelerator that can perform inference directly on a compressed sparse model, which results in significant speedup and energy saving. His work has been featured by O’Reilly, TechEmergence, TheNextPlatform, and Embedded Vision, and it has impacted the industry. He led research efforts in model compression and hardware acceleration that won the Best Paper Award at ICLR’16 and the Best Paper Award at FPGA’17. Before joining Stanford, Song graduated from Tsinghua University.

Date and Time

Monday April 3, 2017 4:30pm - 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Event Type

CS Department Colloquium Series

Speaker

Song Han, from Stanford University

Host

Prof. Kai Li and Prof. David Wentzlaff, Computer Science and Electrical Engineering Departments

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.

CS Talks Mailing List

04-03 Efficient Methods and Hardware for Deep Learning

04-03
Efficient Methods and Hardware for Deep Learning