Predictive Maintenance Using Deep Reinforcement Learning in Cloud Infrastructure Management

Sathish Kaniganahalli Ramareddy

Abstract

The exponential growth of cloud computing infrastructures has intensified the need for intelligent predictive maintenance to ensure reliability and minimize downtime. This paper presents a novel Deep Reinforcement Learning (DRL)-based framework for predictive maintenance in cloud infrastructure management. The proposed system integrates telemetry-based data acquisition, deep policy learning, and orchestration-driven action execution to create an adaptive, self-healing maintenance ecosystem. Using telemetry data from simulated multi-node cloud environments, the DRL agent learns optimal maintenance policies that minimize failure risk while reducing operational costs. Comparative analysis against traditional models—Random Forest, LSTM, and Q-Learning—demonstrates the superior performance of the DRL approach, achieving 96.3% fault prediction accuracy, 42.1% downtime reduction, and 39.5% maintenance cost savings. The framework’s closed-loop architecture enables continuous learning and dynamic optimization, ensuring proactive fault mitigation and resource efficiency. Results highlight the framework’s scalability, adaptability, and real-time decision-making capability, confirming its potential to revolutionize predictive maintenance in cloud systems. Future work will extend the model to multi-agent and federated settings for distributed predictive intelligence in hybrid cloud environments.
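To make the closed-loop idea in the abstract concrete, the following is a minimal sketch of a DRL-style maintenance agent: a toy simulated telemetry environment and a DQN-like policy that chooses between doing nothing, migrating workload, or scheduling maintenance. All class names, state features, reward values, and hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch only: names, features, and rewards are illustrative,
# not taken from the paper's framework.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn


class SimulatedNodeEnv:
    """Toy telemetry environment: state = [cpu, mem, temp, error_rate] in [0, 1]."""
    ACTIONS = ["no_op", "migrate_workload", "schedule_maintenance"]

    def reset(self):
        self.state = np.random.rand(4).astype(np.float32)
        return self.state

    def step(self, action):
        if action == 2:                                   # maintenance restores health
            self.state = np.random.rand(4) * 0.3
            reward = -1.0                                 # planned maintenance cost
        elif action == 1:                                 # migration relieves load
            self.state = np.clip(self.state - 0.2, 0.0, 1.0)
            reward = -0.3
        else:                                             # no-op: gradual degradation
            self.state = np.clip(self.state + 0.05 * np.random.rand(4), 0.0, 1.0)
            reward = 0.0
        failed = self.state.mean() > 0.9                  # unplanned failure
        if failed:
            reward -= 10.0                                # large downtime penalty
        return self.state.astype(np.float32), reward, failed


# Small Q-network mapping telemetry to action values
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 3))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay, gamma, eps = deque(maxlen=10_000), 0.99, 0.1

env = SimulatedNodeEnv()
state = env.reset()
for step in range(5_000):
    # Epsilon-greedy action selection over the current telemetry state
    if random.random() < eps:
        action = random.randrange(3)
    else:
        with torch.no_grad():
            action = int(q_net(torch.from_numpy(state)).argmax())
    next_state, reward, done = env.step(action)
    replay.append((state, action, reward, next_state, done))
    state = env.reset() if done else next_state

    # One-step temporal-difference update from a replayed minibatch
    if len(replay) >= 64:
        s, a, r, s2, d = zip(*random.sample(replay, 64))
        s, s2 = torch.from_numpy(np.stack(s)), torch.from_numpy(np.stack(s2))
        a = torch.tensor(a, dtype=torch.int64).unsqueeze(1)
        r = torch.tensor(r, dtype=torch.float32)
        d = torch.tensor(d, dtype=torch.float32)
        q = q_net(s).gather(1, a).squeeze(1)
        with torch.no_grad():
            target = r + gamma * (1.0 - d) * q_net(s2).max(dim=1).values
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

In the setting the abstract describes, the observation would presumably come from real telemetry streams and the chosen action would be executed through the orchestration layer rather than a toy environment, with the same observe, decide, act, learn loop closing the cycle.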

Article Details

How to Cite
Ramareddy, S. K. (2024). Predictive Maintenance Using Deep Reinforcement Learning in Cloud Infrastructure Management. International Journal on Advanced Computer Theory and Engineering, 13(1), 21–30. https://doi.org/10.65521/ijacte.v13i1.878
Section
Articles
