TY - JOUR
T1 - Low Latency Deep Learning Inference Model for Distributed Intelligent IoT Edge Clusters
AU - Naveen, Soumyalatha
AU - Kounte, Manjunath R.
AU - Ahmed, Mohammed Riyaz
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - Edge computing is a new paradigm enabling intelligent applications for the Internet of Things (IoT) using mobile, low-cost IoT devices with embedded data analytics. Because IoT devices are resource-limited, these resources must be used optimally. Intelligence therefore needs to be applied through an efficient deep learning model that optimizes the use of resources such as memory, power, and computational capacity. In addition, intelligent edge computing is essential for real-time applications that require an end-to-end delay or response time within a few seconds. We propose decentralized heterogeneous edge clusters deployed with an optimized pre-trained YOLOv2 model. In our model, the weights are pruned, then split into fused layers and distributed to edge devices for processing. The gateway device then merges the partial results from each edge device to obtain the processed output. We deploy a convolutional neural network (CNN) on resource-constrained IoT devices to make them intelligent and realistic. We evaluated the proposed model by deploying it on five IoT edge devices and a gateway device equipped with a hardware accelerator. The evaluation shows significant improvement in communication size and inference latency. Compared to DeepThings with $5\times 5$ fused-layer partitioning on five devices, our proposed model reduces communication size by 14.4% and inference latency by 16%.
AB - Edge computing is a new paradigm enabling intelligent applications for the Internet of Things (IoT) using mobile, low-cost IoT devices with embedded data analytics. Because IoT devices are resource-limited, these resources must be used optimally. Intelligence therefore needs to be applied through an efficient deep learning model that optimizes the use of resources such as memory, power, and computational capacity. In addition, intelligent edge computing is essential for real-time applications that require an end-to-end delay or response time within a few seconds. We propose decentralized heterogeneous edge clusters deployed with an optimized pre-trained YOLOv2 model. In our model, the weights are pruned, then split into fused layers and distributed to edge devices for processing. The gateway device then merges the partial results from each edge device to obtain the processed output. We deploy a convolutional neural network (CNN) on resource-constrained IoT devices to make them intelligent and realistic. We evaluated the proposed model by deploying it on five IoT edge devices and a gateway device equipped with a hardware accelerator. The evaluation shows significant improvement in communication size and inference latency. Compared to DeepThings with $5\times 5$ fused-layer partitioning on five devices, our proposed model reduces communication size by 14.4% and inference latency by 16%.
UR - https://www.scopus.com/pages/publications/85120579352
UR - https://www.scopus.com/inward/citedby.url?scp=85120579352&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3131396
DO - 10.1109/ACCESS.2021.3131396
M3 - Article
AN - SCOPUS:85120579352
SN - 2169-3536
VL - 9
SP - 160607
EP - 160621
JO - IEEE Access
JF - IEEE Access
ER -