AI-Driven Predictive Resource Management for Scalable and Resilient Cloud Infrastructure

Shraddhaben R. Gajjar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 187 - Issue 87

Published: March 2026

10.5120/ijca2026926515

Shraddhaben R. Gajjar. AI-Driven Predictive Resource Management for Scalable and Resilient Cloud Infrastructure. International Journal of Computer Applications. 187, 87 (March 2026), 45-49. DOI=10.5120/ijca2026926515

@article{10.5120/ijca2026926515,
  author    = {Shraddhaben R. Gajjar},
  title     = {AI-Driven Predictive Resource Management for Scalable and Resilient Cloud Infrastructure},
  journal   = {International Journal of Computer Applications},
  year      = {2026},
  volume    = {187},
  number    = {87},
  pages     = {45-49},
  doi       = {10.5120/ijca2026926515},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2026
%A Shraddhaben R. Gajjar
%T AI-Driven Predictive Resource Management for Scalable and Resilient Cloud Infrastructure
%J International Journal of Computer Applications
%V 187
%N 87
%P 45-49
%R 10.5120/ijca2026926515
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Cloud infrastructure underpins modern healthcare systems, financial platforms, artificial-intelligence services, and public-sector applications. In these environments, infrastructure managers must continuously balance service-level objectives against rising compute cost. Reactive autoscaling remains the dominant operational mechanism in practice, but threshold-based policies expand capacity only after congestion has already appeared, which can produce latency spikes, brief SLA violations, and persistent over-provisioning. This paper presents a practical AI-driven predictive resource management framework for scalable and resilient cloud infrastructure. The framework combines workload forecasting, multivariate anomaly detection, policy-aware decision logic, and cloud-native orchestration to allocate resources before demand peaks occur. A prototype evaluation using representative cyclical and bursty workloads compares static provisioning, reactive autoscaling, and predictive scaling. The predictive approach reduces total compute-hours by 22.2% versus reactive autoscaling and 36.4% versus static provisioning, while improving SLA compliance to 99.1% and increasing average utilization to 76%. The paper also discusses design trade-offs, deployment constraints, and portability across Kubernetes-based environments. The results suggest that predictive resource management can materially improve both resilience and cost efficiency when integrated with disciplined observability and automated control loops.
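The contrast the abstract draws between reactive and predictive scaling can be illustrated with a minimal sketch. It is not the paper's framework: the function names, the per-replica capacity of 100 requests, the synthetic bursty workload, and the seasonal-naive forecast (reuse the load observed one period earlier) are all assumptions chosen for illustration, and the forecasting model actually used in the paper is not stated on this page. The sketch shows only the timing effect, namely that a policy sized from the previous observation is always one step late when a burst begins, while a policy sized from a forecast can have capacity in place beforehand.

```python
def replicas_needed(load, capacity_per_replica=100):
    """Smallest replica count whose combined capacity covers `load` (ceiling division)."""
    return max(1, -(-load // capacity_per_replica))


def reactive_plan(loads, capacity_per_replica=100):
    """Reactive policy: capacity at step t is sized from the load observed at step t-1."""
    plan = [replicas_needed(loads[0], capacity_per_replica)]
    for t in range(1, len(loads)):
        plan.append(replicas_needed(loads[t - 1], capacity_per_replica))
    return plan


def predictive_plan(loads, period, capacity_per_replica=100):
    """Predictive policy using a seasonal-naive forecast (an assumed stand-in model).

    Once a full period of history exists, the forecast for step t is the load
    seen at step t - period. During warm-up it falls back to the reactive rule.
    """
    plan = []
    for t in range(len(loads)):
        if t >= period:
            forecast = loads[t - period]
        elif t > 0:
            forecast = loads[t - 1]
        else:
            forecast = loads[0]
        plan.append(replicas_needed(forecast, capacity_per_replica))
    return plan


def sla_violations(loads, plan, capacity_per_replica=100):
    """Count the steps where demand exceeds the provisioned capacity."""
    return sum(
        1 for load, replicas in zip(loads, plan)
        if load > replicas * capacity_per_replica
    )


if __name__ == "__main__":
    # Hypothetical workload: a burst recurring every 6 steps, repeated 4 times.
    period = 6
    loads = [100, 100, 100, 400, 400, 100] * 4

    reactive = reactive_plan(loads)
    predictive = predictive_plan(loads, period)

    print("reactive violations:  ", sla_violations(loads, reactive))    # 4
    print("predictive violations:", sla_violations(loads, predictive))  # 1
```

On this synthetic workload the reactive plan is under-provisioned at the first step of every burst, whereas the predictive plan misses only the burst in the warm-up period. Both plans consume the same number of replica-steps here, so the sketch says nothing about the compute-hour savings reported in the abstract; those depend on the paper's own workloads and models.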

Index Terms

Computer Science

Information Sciences

Keywords

Predictive autoscaling, AIOps, workload forecasting, Kubernetes, anomaly detection, cloud optimization