Publications
Year 2022
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
G. Xiao*, J. Lin*, M. Seznec, J. Demouth, S. Han
arXiv
paper / code / integration by NVIDIA
- BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation
Z. Liu*, H. Tang*, A. Amini, X. Yang, H. Mao, D. L. Rus, S. Han
ICRA’23
paper / code / website / demo
- On-Device Training Under 256KB Memory
J. Lin*, L. Zhu*, W. Chen, W. Wang, C. Gan, S. Han
NeurIPS’22
paper / website / demo / code / slides / poster
- Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
M. Li, J. Lin, C. Meng, S. Ermon, S. Han, J. Zhu
NeurIPS’22
- Network Augmentation for Tiny Deep Learning
H. Cai, C. Gan, J. Lin, S. Han
ICLR’22
paper / code
- LitePose: Efficient Architecture Design for 2D Human Pose Estimation
Y. Wang, M. Li, H. Cai, W. Chen, S. Han
CVPR’22
paper / code
- TorchSparse: Efficient Point Cloud Inference Engine
H. Tang, Z. Liu, X. Li, Y. Lin, S. Han
MLSys’22
paper / website / code
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
H. Cai, J. Lin, Y. Lin, Z. Liu, H. Tang, H. Wang, L. Zhu, S. Han
TODAES’22
paper
- QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
H. Wang, Y. Ding, J. Gu, Z. Li, Y. Lin, D. Pan, F. Chong, S. Han
HPCA’22
paper / qmlsys website / TorchQuantum / MIT News / video
- QuantumNAT: Quantum Noise-Aware Training with Noise Injection, Quantization and Normalization
H. Wang, J. Gu, Y. Ding, Z. Li, F. T. Chong, D. Z. Pan, S. Han
DAC’22
paper / qmlsys website / code
- QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
H. Wang, Z. Li, J. Gu, Y. Ding, D. Z. Pan, S. Han
DAC’22
paper / qmlsys website / code
- QuEst: Graph Transformer for Quantum Circuit Reliability Prediction
H. Wang, P. Liu, J. Cheng, Z. Liang, J. Gu, Z. Li, Y. Ding, W. Jiang, Y. Shi, X. Qian, D. Z. Pan, F. T. Chong, S. Han
ICCAD’22, invited
Year 2021
- MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning
J. Lin, W. Chen, H. Cai, C. Gan, S. Han
NeurIPS’21
- Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning
L. Zhu, H. Lin, Y. Lu, Y. Lin, S. Han
NeurIPS’21
- PointAcc: Efficient Point Cloud Accelerator
Y. Lin, Z. Zhang, H. Tang, H. Wang, S. Han
MICRO’21
- LocTex: Learning Data-Efficient Visual Representations from Localized Textual Supervision
Z. Liu, S. Stent, J. Li, J. Gideon, S. Han
ICCV’21
- AlignNet: Annotation-Free Camera-LiDAR Calibration with Semantic Alignment Loss
Z. Liu, H. Tang, S. Zhu, S. Han
IROS’21
- NAAS: Neural Accelerator Architecture Search
Y. Lin, M. Yang, S. Han
DAC’21. Selected to present at VLSI Design’22
- Anycost GANs for Interactive Image Synthesis and Editing
J. Lin, R. Zhang, F. Ganz, S. Han, J. Zhu
CVPR’21
- Efficient and Robust LiDAR-Based End-to-End Navigation
Z. Liu, A. Amini, S. Zhu, S. Karaman, S. Han, D. Rus
ICRA’21
- IOS: Inter-Operator Scheduler for CNN Acceleration
Y. Ding, L. Zhu, Z. Jia, G. Pekhimenko, S. Han
MLSys’21
- SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
H. Wang, Z. Zhang, S. Han
HPCA’21
Year 2020
- MCUNet: Tiny Deep Learning on IoT Devices
J. Lin, W. Chen, Y. Lin, J. Cohn, C. Gan, S. Han
NeurIPS’20. Spotlight
- Tiny Transfer Learning: Reduce Activations, not Trainable Parameters for Efficient On-Device Learning
H. Cai, C. Gan, L. Zhu, S. Han
NeurIPS’20
- Differentiable Augmentation for Data-Efficient GAN Training
S. Zhao, Z. Liu, J. Lin, J. Zhu, S. Han
NeurIPS’20
- Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
H. Tang*, Z. Liu*, S. Zhao, Y. Lin, J. Lin, H. Wang, S. Han
ECCV’20
- DataMix: Efficient Privacy-Preserving Edge-Cloud Inference
Z. Liu*, Z. Wu*, C. Gan, L. Zhu, S. Han
ECCV’20
- HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
H. Wang, Z. Wu, Z. Liu, H. Cai, L. Zhu, C. Gan, S. Han
ACL’20
- GAN Compression: Learning Efficient Architectures for Conditional GANs
M. Li, J. Lin, Y. Ding, Z. Liu, J. Zhu, S. Han
CVPR’20
- APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
T. Wang, K. Wang, H. Cai, J. Lin, Z. Liu, S. Han
CVPR’20
- GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning
H. Wang, K. Wang, J. Yang, L. Shen, N. Sun, H.-S. Lee, S. Han
DAC’20
- SpArch: Efficient Architecture for Sparse Matrix Multiplication
Z. Zhang, H. Wang, S. Han, W.J. Dally
HPCA’20
- Once-for-All: Train One Network and Specialize It for Efficient Deployment
H. Cai, C. Gan, T. Wang, Z. Zhang, S. Han
ICLR’20 (also appeared at the TinyML Summit, SysML workshop, and CVPR’20 workshop)
- Lite Transformer with Long-Short Range Attention
Z. Wu, Z. Liu, J. Lin, Y. Lin, S. Han
ICLR’20
Year 2019
- Point-Voxel CNN for Efficient 3D Deep Learning
Z. Liu, H. Tang, Y. Lin, S. Han
NeurIPS’19. Spotlight
- Deep Leakage from Gradients
L. Zhu, Z. Liu, S. Han
NeurIPS’19
- TSM: Temporal Shift Module for Efficient Video Understanding
J. Lin, C. Gan, S. Han
ICCV’19
- HAQ: Hardware-Aware Automated Quantization
K. Wang*, Z. Liu*, Y. Lin*, J. Lin, S. Han
CVPR’19. Oral presentation
- ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
H. Cai, L. Zhu, S. Han
ICLR’19
- Defensive Quantization: When Efficiency Meets Robustness
J. Lin, C. Gan, S. Han
ICLR’19
Year 2018
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Y. He, J. Lin, Z. Liu, H. Wang, L. Li, S. Han
ECCV’18