LSTM-AE for anomaly detection on multivariate telemetry data

Abdennebi, Anes and Tuncay, Alp and Yılmaz, Cemal and Koyuncu, Anıl and Gungor, Oktay (2023) LSTM-AE for anomaly detection on multivariate telemetry data. In: IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications (SERA), Orlando, FL, USA

Full text not available from this repository. (Request a copy)

Abstract

Organizations and companies that collect data generated by sales, transactions, client/server communications, IoT nodes, devices, engines, or any other data generating/exchanging source, need to analyze this data to reveal insights about the running activities on their systems. Since streaming data has multivariate variables bearing dependencies among each other that extend temporally (to previous time steps).Long-Short Term Memory (LSTM) is a variant of the Recurrent Neural Networks capable of learning long-Term dependencies using previous timesteps of sequence-shape data. The LSTM model is a valid option to apply to our data for offline anomaly detection and help foresee future system incidents. Anything that negatively affects the system and the services provided via this system is considered an incident.Moreover, the raw input data might be noisy and improper for the model, leading to misleading predictions. A wiser choice is to use an LSTM Autoencoder (LSTM-AE) specialized for extracting meaningful features of the examined data and looking back several steps to preserve temporal dependencies.In our work, we developed two LSTM-AE models. We evaluated them in an industrial setup at Koçfinans (a finance company operating in Turkey), where they have a distributed system of several nodes running dozens of microservices. The outcome of this study shows that our trained LSTM-AE models succeeded in identifying the atypical behavior of offline data with high accuracies. Furthermore, after deploying the models, we identified the system failing at the exact times for the previous two reported failures. While after deployment, it launched cautions preceding the actual failure by a week, proving efficiency on online data. Our models achieved 99.7% accuracy and 89.1% as F1-score. Moreover, it shows potential in finding the proper LSTM-AE model architecture when time series data with temporal dependency property is fed to the model.
Item Type: Papers in Conference Proceedings
Uncontrolled Keywords: Anomaly Detection; Autoencoder; LSTM; Machine learning; System Incident Prediction
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.
Faculty of Engineering and Natural Sciences
Depositing User: Cemal Yılmaz
Date Deposited: 04 Sep 2023 13:50
Last Modified: 04 Sep 2023 13:50
URI: https://research.sabanciuniv.edu/id/eprint/47740

Actions (login required)

View Item
View Item