TY - JOUR
T1 - Optimized Deep Learning Classification Model for Intelligent Edge devices
AU - Naveen, Soumyalatha
AU - Kounte, Manjunath R.
N1 - Publisher Copyright: © 2024 School of Science, DUTH. All rights reserved.
PY - 2024
Y1 - 2024
N2 - Deep learning models enable state-of-the-art accuracy in computer vision applications. However, the deep, computationally expensive, and densely connected architectures of deep neural networks (DNNs) limit their deployment on resource-constrained embedded IoT devices. We propose an efficient neural network compression framework that performs filter pruning, fine-tuning, and 8-bit quantization to reduce computational complexity, inference time, and memory footprint. Furthermore, reducing the bit widths of activations and weights helps produce a compact model for deployment on resource-limited IoT devices such as smartphones. The proposed system is evaluated extensively on the CIFAR-10 dataset with the ResNet34 and VGG16 models. In addition, we examine the efficacy of a larger model. The results show that pruning followed by quantization compresses the neural network, achieving an accuracy of 78.01% for ResNet34 and 82.34% for VGG16 after pruning and quantization, a marginal accuracy loss of <1% compared to the baseline model. Further, the number of unique parameters in the weight matrix of the model is reduced by 80x using k-means clustering along with 8-bit quantization. The study demonstrates that the pruning process had a minimal impact on ResNet34's accuracy, while VGG16 maintained its accuracy even after pruning. Both models showed a reduced memory footprint after applying k-means clustering and 8-bit quantization, making them more efficient for inference tasks without significantly sacrificing performance. Applications such as smart traffic management and autonomous vehicles involve deploying edge devices with cameras and sensors at intersections and roadsides to monitor and analyze real-time traffic conditions. The proposed optimized model can be employed for efficient object recognition and classification of vehicles, pedestrians, and traffic signs.
AB - Deep learning models enable state-of-the-art accuracy in computer vision applications. However, the deep, computationally expensive, and densely connected architectures of deep neural networks (DNNs) limit their deployment on resource-constrained embedded IoT devices. We propose an efficient neural network compression framework that performs filter pruning, fine-tuning, and 8-bit quantization to reduce computational complexity, inference time, and memory footprint. Furthermore, reducing the bit widths of activations and weights helps produce a compact model for deployment on resource-limited IoT devices such as smartphones. The proposed system is evaluated extensively on the CIFAR-10 dataset with the ResNet34 and VGG16 models. In addition, we examine the efficacy of a larger model. The results show that pruning followed by quantization compresses the neural network, achieving an accuracy of 78.01% for ResNet34 and 82.34% for VGG16 after pruning and quantization, a marginal accuracy loss of <1% compared to the baseline model. Further, the number of unique parameters in the weight matrix of the model is reduced by 80x using k-means clustering along with 8-bit quantization. The study demonstrates that the pruning process had a minimal impact on ResNet34's accuracy, while VGG16 maintained its accuracy even after pruning. Both models showed a reduced memory footprint after applying k-means clustering and 8-bit quantization, making them more efficient for inference tasks without significantly sacrificing performance. Applications such as smart traffic management and autonomous vehicles involve deploying edge devices with cameras and sensors at intersections and roadsides to monitor and analyze real-time traffic conditions. The proposed optimized model can be employed for efficient object recognition and classification of vehicles, pedestrians, and traffic signs.
UR - https://www.scopus.com/pages/publications/85200370385
U2 - 10.25103/jestr.173.11
DO - 10.25103/jestr.173.11
M3 - Article
AN - SCOPUS:85200370385
SN - 1791-9320
VL - 17
SP - 88
EP - 94
JO - Journal of Engineering Science and Technology Review
JF - Journal of Engineering Science and Technology Review
IS - 3
ER -