The demand for running NNs in embedded environments has grown considerably in recent years, driven by the success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. Achieving high accuracy on resource-restricted devices, however, remains challenging, mainly because of the vast number of design parameters that must be balanced. While quantization of CNN parameters reduces power and area, it can also cause unexpected shifts in the balance between communication and computation. These shifts are hard to evaluate, and an imbalance may leave either memory bandwidth or computational resources underutilized, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method can help evaluate architecture alternatives for resource-restricted CNN accelerators (e.g., as part of real-time embedded systems) early in the design process and thus prevent design mistakes.
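The kind of early-stage bottleneck check the abstract describes can be illustrated with a simple roofline-style calculation: compare a layer's arithmetic intensity (operations per byte moved) against the accelerator's compute-to-bandwidth ratio. The function names, layer dimensions, and hardware numbers below are illustrative assumptions, not the paper's actual model.

```python
# Hedged sketch: decide whether a CNN layer is memory- or compute-bound
# at a given quantization bit width. All parameters here are assumptions.

def arithmetic_intensity(macs, weight_count, act_count, bits):
    """Ops per byte moved for one layer at the given bit width."""
    bytes_moved = (weight_count + act_count) * bits / 8.0
    return macs / bytes_moved

def bottleneck(macs, weight_count, act_count, bits,
               peak_ops=1e12, bandwidth=10e9):
    """Return 'memory' or 'compute' depending on which resource saturates."""
    intensity = arithmetic_intensity(macs, weight_count, act_count, bits)
    ridge = peak_ops / bandwidth  # ops/byte where the two limits meet
    return "memory" if intensity < ridge else "compute"

# A 3x3 conv layer, 256 -> 256 channels, on a 14x14 feature map:
macs = 3 * 3 * 256 * 256 * 14 * 14
weights = 3 * 3 * 256 * 256
acts = 2 * 256 * 14 * 14  # input + output activations
print(bottleneck(macs, weights, acts, bits=32))  # prints "memory"
print(bottleneck(macs, weights, acts, bits=4))   # prints "compute"
```

Note how quantizing from 32 to 4 bits shrinks the bytes moved and flips this toy layer from memory-bound to compute-bound, which is exactly the kind of balance shift the abstract warns is hard to anticipate without analysis.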
Alex Karbachevsky; Chaim Baskin; Evgenii Zheltonozhskii; Yevgeny Yermolin; Freddy Gabbay; Alex Bronstein; Avi Mendelson. Early-Stage Neural Network Hardware Performance Analysis. Sustainability 2021, 13 (2), 717.
Deep neural networks are known to be vulnerable to inputs with maliciously constructed adversarial perturbations aimed at forcing misclassification. We study randomized smoothing as a way to both improve performance on unperturbed data and increase robustness to adversarial attacks. Moreover, we extend the method proposed by arXiv:1811.09310 by adding low-rank multivariate noise, which we then use as a base model for smoothing. The proposed method achieves 58.5% top-1 accuracy on CIFAR-10 under PGD attack and outperforms previous works by 4%. In addition, we consider a family of attacks that were previously used for training purposes in the certified robustness scheme. We demonstrate that the proposed attacks are more effective than PGD against both smoothed and non-smoothed models. Since our method is based on sampling, it lends itself well to trading off model inference complexity against performance. A reference implementation of the proposed techniques is provided at https://github.com/yanemcovsky/SIAM.
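The core smoothing idea, and the sampling-based cost/performance trade-off the abstract mentions, can be sketched as follows. This is an assumed minimal form of smoothed inference (classify many noisy copies of the input and average the predictions), not the paper's exact method; the stand-in linear `model` and noise level are our own.

```python
import numpy as np

# Hedged sketch of smoothed inference: average class probabilities over
# randomly perturbed copies of the input. The classifier is a toy stand-in.

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 32))  # toy: 32-dim inputs, 10 classes

def model(x):
    """Stand-in classifier: softmax over a linear map."""
    z = W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

def smoothed_predict(x, sigma=0.25, n_samples=64):
    """Average class probabilities over Gaussian perturbations of x."""
    probs = np.mean([model(x + sigma * rng.normal(size=x.shape))
                     for _ in range(n_samples)], axis=0)
    return int(np.argmax(probs))

x = rng.normal(size=32)
label = smoothed_predict(x)                     # cheaper, noisier estimate
label_hq = smoothed_predict(x, n_samples=512)   # more samples, more stable
```

The `n_samples` knob is the trade-off the abstract refers to: more noise samples cost proportionally more inference time but yield a more stable smoothed prediction.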
Yaniv Nemcovsky; Evgenii Zheltonozhskii; Chaim Baskin; Brian Chmiel; Alex M. Bronstein; Avi Mendelson. Smoothed Inference for Adversarially-Trained Models. 2019, 1.
Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. On the other hand, we show that with more aggressive quantization, the loss landscape becomes highly non-separable with sharp minima, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods. A reference implementation accompanies the paper at https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq
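The contrast between per-layer and joint selection of quantization parameters can be sketched with a toy two-layer network: instead of choosing each layer's clipping range in isolation (e.g., by per-tensor MSE), search the clipping values jointly to minimize a network-level loss proxy. The network, loss, and grid search below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Hedged sketch of loss-aware post-training quantization: pick both layers'
# clipping values together by minimizing the error of the quantized network's
# output against the float reference. Everything here is a toy assumption.

rng = np.random.default_rng(1)
W1 = rng.normal(size=(8, 8))
W2 = rng.normal(size=(8, 8))
X = rng.normal(size=(8, 32))
Y = W2 @ np.tanh(W1 @ X)  # reference float outputs

def quantize(w, clip, bits=4):
    """Uniform symmetric quantization of w into [-clip, clip]."""
    levels = 2 ** (bits - 1) - 1
    scale = clip / levels
    return np.clip(np.round(w / scale), -levels, levels) * scale

def net_loss(c1, c2):
    """Network-level error after quantizing both layers with clips c1, c2."""
    out = quantize(W2, c2) @ np.tanh(quantize(W1, c1) @ X)
    return float(np.mean((out - Y) ** 2))

# Joint grid search over both clipping values (a 2-D grid is enough for the
# sketch; coordinate descent scales better with depth).
grid = np.linspace(0.5, 3.0, 11)
c1, c2 = min(((a, b) for a in grid for b in grid), key=lambda p: net_loss(*p))
print(c1, c2, net_loss(c1, c2))
```

At aggressive bit widths the best `(c1, c2)` pair is not, in general, the pair of per-layer optima, which is the non-separability the abstract describes.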
Yury Nahshan; Brian Chmiel; Chaim Baskin; Evgenii Zheltonozhskii; Ron Banner; Alex M. Bronstein; Avi Mendelson. Loss Aware Post-training Quantization. 2019, 1.
Convolutional neural networks (CNNs) have become a popular choice for various tasks such as computer vision, speech recognition, and natural language processing. Thanks to their large computational capability and throughput, GPUs are the most common platform for both training and inference, but they are not power efficient and therefore do not suit low-power systems such as mobile devices. Recent studies have shown that FPGAs can provide a good alternative to GPUs as CNN accelerators due to their reconfigurable nature, low power consumption, and small latency. For FPGA-based accelerators to outperform GPUs in inference tasks, both the parameters of the network and the activations must be quantized. While most works use uniform quantizers for both parameters and activations, a uniform quantizer is not always optimal, and a non-uniform quantizer needs to be considered. In this work, we introduce a custom hardware-friendly approach to implementing non-uniform quantizers. In addition, we use a single-scale integer representation of both parameters and activations for both training and inference. The combined method yields a hardware-efficient non-uniform quantizer fit for real-time applications. We tested our method on the CIFAR-10 and CIFAR-100 image classification datasets with ResNet-18 and VGG-like architectures and observed little degradation in accuracy.
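The basic operation of a non-uniform quantizer can be sketched as snapping each weight to the nearest entry in a small, unevenly spaced set of levels. Power-of-two levels are a common hardware-friendly choice (multiplication by them reduces to bit shifts), but the specific level set below is our assumption, not the paper's quantizer.

```python
import numpy as np

# Hedged sketch of non-uniform quantization: map each value to its nearest
# level from an uneven grid. The power-of-two level set is an assumption.

def nonuniform_quantize(w, levels):
    """Map each element of w to its nearest quantization level."""
    levels = np.asarray(levels)
    idx = np.argmin(np.abs(w[..., None] - levels), axis=-1)
    return levels[idx]

# Signed power-of-two grid: 0 and +/- {1, 1/2, 1/4} -- 7 levels, so values
# near zero are represented more densely than a uniform grid allows.
pot = [0.0] + [s * 2.0 ** -k for s in (1, -1) for k in (0, 1, 2)]

w = np.array([0.9, -0.3, 0.05, -0.6])
print(nonuniform_quantize(w, pot))  # -> [ 1.   -0.25  0.   -0.5 ]
```

The uneven spacing is the point: weight distributions are typically peaked around zero, so concentrating levels there loses less information than a uniform grid at the same bit width.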
Natan Liss; Chaim Baskin; Avi Mendelson; Alex M. Bronstein; Raja Giryes. Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware. 2018, 1.