International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 74
Published: January 2026
Authors: Muzeeb Mohammad
10.5120/ijca2026926263
Muzeeb Mohammad. Survey on AI-Based Reliability and Anomaly Detection in Microservices. International Journal of Computer Applications. 187, 74 (January 2026), 56-63. DOI=10.5120/ijca2026926263
@article{ 10.5120/ijca2026926263,
author = { Muzeeb Mohammad },
title = { Survey on AI-Based Reliability and Anomaly Detection in Microservices },
journal = { International Journal of Computer Applications },
year = { 2026 },
volume = { 187 },
number = { 74 },
pages = { 56-63 },
doi = { 10.5120/ijca2026926263 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2026
%A Muzeeb Mohammad
%T Survey on AI-Based Reliability and Anomaly Detection in Microservices
%J International Journal of Computer Applications
%V 187
%N 74
%P 56-63
%R 10.5120/ijca2026926263
%I Foundation of Computer Science (FCS), NY, USA
Microservice architectures enable scalable, agile applications, but their complexity introduces significant reliability challenges. Traditional monitoring often struggles to keep pace with the dynamic and distributed nature of microservices, motivating artificial intelligence (AI)-driven techniques for proactive anomaly detection and fault management. This survey reviews the state of the art in applying AI to reliability engineering and anomaly detection in microservice-based systems. This paper proposes a taxonomy covering (i) the observability signals used by anomaly detectors: metrics, logs and traces; (ii) the modelling techniques employed, from statistical and classical machine learning through deep learning, graph-based methods and large language models; and (iii) the deployment layer at which detection operates: centralized cloud clusters, distributed edge environments and service meshes. This survey analyzes representative systems and frameworks, comparing their strengths, weaknesses, data requirements, evaluation metrics, scalability and interpretability. Common challenges such as the entropy gap in anomaly scoring, scarcity of real-world labelled anomalies, the need for explainable results and compute constraints in distributed environments are highlighted. This survey concludes with open problems and future directions, emphasizing opportunities in multimodal data fusion, federated and edge-based detection, and human-in-the-loop root-cause analysis for the next generation of reliable microservice ecosystems.