Machine Learning and Statistical Techniques for Outlier Detection in Smart Home Energy Consumption

  • N. Sri Krishna
  • Y. V. Pavan Kumar
  • K. Purna Prakash*
  • G. Pradeep Reddy

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

With the continuing worldwide growth of smart homes, large volumes of energy consumption data have gained the attention of data scientists. Smart meters capture energy consumption readings at a predefined rate and store them in a database. High-quality databases are needed for accurate analysis and decision-making. However, these readings often contain anomalies, namely missingness, redundancy, and outliers, caused by issues in the meter or data communication networks. Among these, outlier readings indicate abnormal load behavior (e.g., nonlinearity, unpredicted load switching, or system faults). Hence, it is essential to detect and visualize such anomalies so that they can be treated appropriately. With this motivation, this paper implements several key machine learning and statistical techniques, namely autoregressive integrated moving average (ARIMA), autoencoder, density-based spatial clustering of applications with noise (DBSCAN), isolation forest, k-means, hierarchical density-based spatial clustering of applications with noise (HDBSCAN), one-class support vector machine (SVM), local outlier factor (LOF), long short-term memory (LSTM), winsorization, interquartile range (IQR), and Z-score. The results revealed that DBSCAN consistently demonstrated the most accurate performance in detecting outliers in energy data, while Z-score, IQR, and winsorization provided reasonable outcomes but were limited in handling complex and nonlinear data patterns. Autoencoder, isolation forest, and one-class SVM showed moderate success, but their performance depended on the specific dataset characteristics. K-means exhibited mixed results. ARIMA, LOF, LSTM, and HDBSCAN had limited success in outlier detection in the time-series data. Thus, this analysis recommends DBSCAN as the best technique, as it consistently outperformed the other machine learning and statistical techniques in accurately detecting outliers in smart home energy consumption data.
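As an illustration of the three statistical techniques named in the abstract (Z-score, IQR, and winsorization), the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names, the sample readings, the Z-score threshold of 3, the IQR multiplier of 1.5, and the 5th/95th percentile limits are all assumptions chosen for illustration.

```python
import statistics

# Hypothetical smart-meter readings in kWh (not from the paper);
# the value at index 7 is an injected spike.
readings = [0.42, 0.45, 0.40, 0.43, 0.47, 0.41,
            0.44, 5.80, 0.46, 0.39, 0.43, 0.42]


def zscore_outliers(values, threshold=3.0):
    """Return indices whose absolute Z-score exceeds `threshold`.

    Uses the population standard deviation; the threshold of 3 is a
    common convention and an assumption here.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing to flag
    return [i for i, x in enumerate(values)
            if abs(x - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return indices lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, x in enumerate(values) if x < low or x > high]


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clip values to the given lower and upper percentiles.

    Winsorization replaces extreme readings with the percentile limits
    instead of flagging or dropping them.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(x, low), high) for x in values]


if __name__ == "__main__":
    print("Z-score outliers:", zscore_outliers(readings))
    print("IQR outliers:", iqr_outliers(readings))
    print("Winsorized:", winsorize(readings))
```

On this sample series, both detectors flag only the spike at index 7, and winsorization pulls that spike down toward the 95th-percentile limit while keeping the series length unchanged. All three methods rely on global summary statistics of the series, which is consistent with the abstract's remark that they are limited on complex and nonlinear patterns.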

Original language: English
Title of host publication: 2024 IEEE Open Conference of Electrical, Electronic and Information Sciences, eStream 2024 - Proceedings
Editors: Dalius Navakauskas, Sarunas Paulikas, Tomyslav Sledevic, Dainius Udris
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350352412
Publication status: Published - 2024
Event: 11th IEEE Open Conference of Electrical, Electronic and Information Sciences, eStream 2024 - Vilnius, Lithuania
Duration: 25-04-2024 → …

Publication series

Name: 2024 IEEE Open Conference of Electrical, Electronic and Information Sciences, eStream 2024 - Proceedings

Conference

Conference: 11th IEEE Open Conference of Electrical, Electronic and Information Sciences, eStream 2024
Country/Territory: Lithuania
City: Vilnius
Period: 25-04-24 → …

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Instrumentation
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Control and Optimization
