..

Международный журнал сенсорных сетей и передачи данных

Отправить рукопись arrow_forward arrow_forward ..

A Compiler-Aware Framework of Network Pruning and Architecture Search for Mobile Acceleration

Abstract

Zhengang Li

With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase the execution speed. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimization which is a must-do for mobile acceleration. In this work, we propose NPAS, a compiler-aware unified network pruning and architecture search and the corresponding comprehensive compiler optimizations supporting different DNNs and different pruning schemes, which bridge the gap of weight pruning and NAS. Our framework achieves 6.7 ms, 5.9 ms, and 3.9 ms ImageNet inference times with 78%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level) Top-1 accuracy respectively on an off-the-shelf mobile phone, consistently outperforming prior work.

Отказ от ответственности: Этот реферат был переведен с помощью инструментов искусственного интеллекта и еще не прошел проверку или верификацию

Поделиться этой статьей

Индексировано в

arrow_upward arrow_upward