Evgenii Zheltonozhskii
Technion, Haifa


Feed

Journal article
Published: 01 June 2021 in ACM Transactions on Computer Systems

We present a novel method for neural network quantization. Our method, named UNIQ, emulates a non-uniform k-quantile quantizer and adapts the model to perform well with quantized weights by injecting noise into the weights at training time. As a by-product of injecting noise into the weights, we find that activations can also be quantized to as low as 8 bits with only minor accuracy degradation. Our non-uniform quantization approach provides a novel alternative to existing uniform quantization techniques for neural networks. We further propose a novel complexity metric, the number of bit operations performed (BOPs), and show that this metric has a linear relation with logic utilization and power. We suggest evaluating the trade-off between accuracy and complexity (BOPs). The proposed method, when evaluated on ResNet-18/34/50 and MobileNet on ImageNet, outperforms the prior state of the art in both the low-complexity and the high-accuracy regimes. We demonstrate the practical applicability of this approach by implementing our non-uniformly quantized CNN on an FPGA.
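To make the idea concrete, here is a minimal NumPy sketch of a k-quantile quantizer and of training-time noise injection. This is an illustrative approximation, not the authors' implementation: the injected noise is simply resampled from the empirical quantization-error distribution.

```python
import numpy as np

def kquantile_levels(w, k):
    # Non-uniform levels: each of the k bins holds an equal share of weights.
    qs = (np.arange(k) + 0.5) / k
    return np.quantile(w.ravel(), qs)

def quantize(w, levels):
    # Snap every weight to its nearest quantization level.
    idx = np.abs(w.ravel()[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx].reshape(w.shape)

def noisy_weights(w, k, rng):
    # Training-time emulation: instead of hard quantization, perturb the
    # weights with noise drawn from the quantization-error distribution,
    # so the model learns to tolerate quantized weights.
    err = (quantize(w, kquantile_levels(w, k)) - w).ravel()
    return w + rng.choice(err, size=w.shape)
```

Because every weight lands in an equally populated bin, the levels adapt to the weight distribution instead of being spaced uniformly.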

ACS Style

Chaim Baskin; Natan Liss; Eli Schwartz; Evgenii Zheltonozhskii; Raja Giryes; Alex M. Bronstein; Avi Mendelson. UNIQ. ACM Transactions on Computer Systems 2021, 37, 1-15.

AMA Style

Chaim Baskin, Natan Liss, Eli Schwartz, Evgenii Zheltonozhskii, Raja Giryes, Alex M. Bronstein, Avi Mendelson. UNIQ. ACM Transactions on Computer Systems. 2021;37(1-4):1-15.

Chicago/Turabian Style

Chaim Baskin; Natan Liss; Eli Schwartz; Evgenii Zheltonozhskii; Raja Giryes; Alex M. Bronstein; Avi Mendelson. 2021. "UNIQ." ACM Transactions on Computer Systems 37, no. 1-4: 1-15.

Journal article
Published: 13 January 2021 in Sustainability

The demand for running neural networks (NNs) in embedded environments has increased significantly in recent years due to the success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. Achieving high accuracy on resource-restricted devices, however, is still considered challenging, mainly due to the vast number of design parameters that need to be balanced. While quantization of CNN parameters reduces power and area, it can also produce unexpected changes in the balance between communication and computation. This change is hard to evaluate, and an imbalance may lead to lower utilization of either memory bandwidth or computational resources, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method helps evaluate different architecture alternatives for resource-restricted CNN accelerators (e.g., parts of real-time embedded systems) early in the design stages and thus prevents design mistakes.
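The communication/computation balance described above can be illustrated with a standard roofline-style check. This is a generic sketch of that intuition, not the paper's framework; `peak_ops` and `peak_bw` are assumed hardware parameters.

```python
def layer_bottleneck(macs, bytes_moved, peak_ops, peak_bw):
    """Classify a layer as compute-bound or memory-bound.

    macs        -- multiply-accumulates the layer performs
    bytes_moved -- weight + activation traffic to/from main memory
    peak_ops    -- accelerator peak throughput (ops/s)
    peak_bw     -- main-memory bandwidth (bytes/s)
    """
    intensity = macs / bytes_moved                 # ops per byte
    attainable = min(peak_ops, intensity * peak_bw)
    return attainable, ("compute" if attainable >= peak_ops else "memory")
```

Quantizing weights shrinks `bytes_moved` and raises arithmetic intensity, which can flip a layer from memory-bound to compute-bound; that shift in balance is exactly what is hard to anticipate without early analysis.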

ACS Style

Alex Karbachevsky; Chaim Baskin; Evgenii Zheltonozhskii; Yevgeny Yermolin; Freddy Gabbay; Alex Bronstein; Avi Mendelson. Early-Stage Neural Network Hardware Performance Analysis. Sustainability 2021, 13, 717.

AMA Style

Alex Karbachevsky, Chaim Baskin, Evgenii Zheltonozhskii, Yevgeny Yermolin, Freddy Gabbay, Alex Bronstein, Avi Mendelson. Early-Stage Neural Network Hardware Performance Analysis. Sustainability. 2021;13(2):717.

Chicago/Turabian Style

Alex Karbachevsky; Chaim Baskin; Evgenii Zheltonozhskii; Yevgeny Yermolin; Freddy Gabbay; Alex Bronstein; Avi Mendelson. 2021. "Early-Stage Neural Network Hardware Performance Analysis." Sustainability 13, no. 2: 717.

Conference paper
Published: 01 July 2020 in 2020 International Joint Conference on Neural Networks (IJCNN)

Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a lossy transform coding approach, inspired by image and video compression, designed to reduce the memory bandwidth due to the storage of intermediate activation calculation results. Our method does not require fine-tuning the network weights and halves the data transfer volumes to the main memory by compressing feature maps, which are highly correlated, with variable-length coding. Our method outperforms the previous approach in terms of the number of bits per value, with minor accuracy degradation, on ResNet-34 and MobileNetV2. We analyze the performance of our approach on a variety of CNN architectures and demonstrate that an FPGA implementation of ResNet-18 with our approach results in a reduction of around 40% in the memory energy footprint, compared to the quantized network, with negligible impact on accuracy. When allowing accuracy degradation of up to 2%, a reduction of 60% is achieved. A reference implementation accompanies the paper.
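As an illustration of the transform-coding idea (a generic DCT-based sketch, not the paper's exact pipeline): correlated values concentrate their energy in a few transform coefficients, so after quantization most coefficients are zero or small and cost few bits under a variable-length code.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis (rows are frequencies).
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] /= np.sqrt(2)
    return M * np.sqrt(2 / n)

def compress_block(x, step):
    # Transform the block, then uniformly quantize the coefficients.
    return np.round(dct_matrix(len(x)) @ x / step)

def decompress_block(q, step):
    # Inverse transform of the dequantized coefficients.
    return dct_matrix(len(q)).T @ (q * step)
```

Because the transform is orthonormal, the reconstruction error is bounded by the quantization step, while the sparse quantized coefficients are what the variable-length coder exploits.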

ACS Style

Brian Chmiel; Chaim Baskin; Evgenii Zheltonozhskii; Ron Banner; Yevgeny Yermolin; Alex Karbachevsky; Alex M. Bronstein; Avi Mendelson. Feature Map Transform Coding for Energy-Efficient CNN Inference. 2020 International Joint Conference on Neural Networks (IJCNN) 2020, 1-9.

AMA Style

Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Yevgeny Yermolin, Alex Karbachevsky, Alex M. Bronstein, Avi Mendelson. Feature Map Transform Coding for Energy-Efficient CNN Inference. 2020 International Joint Conference on Neural Networks (IJCNN). 2020:1-9.

Chicago/Turabian Style

Brian Chmiel; Chaim Baskin; Evgenii Zheltonozhskii; Ron Banner; Yevgeny Yermolin; Alex Karbachevsky; Alex M. Bronstein; Avi Mendelson. 2020. "Feature Map Transform Coding for Energy-Efficient CNN Inference." 2020 International Joint Conference on Neural Networks (IJCNN): 1-9.

Preprint
Published: 17 November 2019

Deep neural networks are known to be vulnerable to inputs with maliciously constructed adversarial perturbations aimed at forcing misclassification. We study randomized smoothing as a way to both improve performance on unperturbed data and increase robustness to adversarial attacks. Moreover, we extend the method proposed in arXiv:1811.09310 by adding low-rank multivariate noise, which we then use as a base model for smoothing. The proposed method achieves 58.5% top-1 accuracy on CIFAR-10 under PGD attack and outperforms previous works by 4%. In addition, we consider a family of attacks which were previously used for training purposes in the certified robustness scheme. We demonstrate that the proposed attacks are more effective than PGD against both smoothed and non-smoothed models. Since our method is based on sampling, it lends itself well to trading off model inference complexity against performance. A reference implementation of the proposed techniques is provided at https://github.com/yanemcovsky/SIAM.
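The sampling-based inference can be sketched as follows. This is a generic randomized-smoothing sketch, not the paper's method: `logits_fn`, `sigma`, and `n` are illustrative parameters, and the paper's low-rank multivariate noise is replaced here by plain isotropic Gaussian noise.

```python
import numpy as np

def smoothed_predict(logits_fn, x, sigma, n, rng):
    # Average softmax outputs over n noisy copies of the input; a larger n
    # costs more inference compute but yields a more stable prediction,
    # which is the complexity/performance trade-off mentioned above.
    probs = np.zeros_like(logits_fn(x))
    for _ in range(n):
        z = logits_fn(x + rng.normal(scale=sigma, size=x.shape))
        e = np.exp(z - z.max())      # numerically stable softmax
        probs += e / e.sum()
    return probs / n
```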

ACS Style

Yaniv Nemcovsky; Evgenii Zheltonozhskii; Chaim Baskin; Brian Chmiel; Alex M. Bronstein; Avi Mendelson. Smoothed Inference for Adversarially-Trained Models. 2019, 1.

AMA Style

Yaniv Nemcovsky, Evgenii Zheltonozhskii, Chaim Baskin, Brian Chmiel, Alex M. Bronstein, Avi Mendelson. Smoothed Inference for Adversarially-Trained Models. 2019:1.

Chicago/Turabian Style

Yaniv Nemcovsky; Evgenii Zheltonozhskii; Chaim Baskin; Brian Chmiel; Alex M. Bronstein; Avi Mendelson. 2019. "Smoothed Inference for Adversarially-Trained Models." 1.

Conference paper
Published: 01 May 2018 in 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Deep neural networks (DNNs) are used by different applications that are executed on a range of computer architectures, from IoT devices to supercomputers. The footprint of these networks is huge, as are their computational and communication needs. In order to ease the pressure on resources, research indicates that in many cases a low-precision representation (1-2 bits per parameter) of weights and other parameters can achieve similar accuracy while requiring fewer resources. Using quantized values enables the use of FPGAs to run NNs, since FPGAs are well suited to these primitives; e.g., FPGAs provide efficient support for bitwise operations and can work with arbitrary-precision representations of numbers. This paper presents a new streaming architecture for running QNNs on FPGAs. The proposed architecture scales out better than the alternatives, allowing us to take advantage of systems with multiple FPGAs. We also include support for skip connections, which are used in state-of-the-art NNs, and show that our architecture allows adding these connections almost for free. All this allowed us to implement an 18-layer ResNet for 224×224 image classification, achieving 57.5% top-1 accuracy. In addition, we implemented a full-sized quantized AlexNet. In contrast to previous works, we use 2-bit activations instead of 1-bit ones, which improves AlexNet's top-1 accuracy from 41.8% to 51.03% on ImageNet classification. Both AlexNet and ResNet can handle 1000-class real-time classification on an FPGA. Our implementation of ResNet-18 consumes 5× less power and is 4× slower on ImageNet, compared to the same NN on the latest Nvidia GPUs. Smaller NNs that fit on a single FPGA run faster than on GPUs for small (32×32) inputs, while consuming up to 20× less energy and power.
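The bitwise primitives mentioned above are what make FPGAs attractive here: for ±1-valued weights and activations, a dot product reduces to XNOR plus popcount. A minimal sketch of that primitive (illustrative, not the paper's hardware design):

```python
def binary_dot(a_bits, b_bits, n):
    # Dot product of two ±1 vectors packed as n-bit masks (bit = 1 means +1):
    # XNOR counts matching positions; each match contributes +1, each
    # mismatch contributes -1, so the result is 2*matches - n.
    matches = bin(~(a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
    return 2 * matches - n
```

Multi-bit values can be handled as weighted sums of such bit-plane products, which is one reason moving from 1-bit to 2-bit activations stays cheap in logic.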

ACS Style

Chaim Baskin; Natan Liss; Evgenii Zheltonozhskii; Alex M. Bronstein; Avi Mendelson. Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2018, 162-169.

AMA Style

Chaim Baskin, Natan Liss, Evgenii Zheltonozhskii, Alex M. Bronstein, Avi Mendelson. Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 2018:162-169.

Chicago/Turabian Style

Chaim Baskin; Natan Liss; Evgenii Zheltonozhskii; Alex M. Bronstein; Avi Mendelson. 2018. "Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform." 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW): 162-169.