International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 98
Published: April 2026
Authors: Rama Krishna Reddy Arumalla
DOI: 10.5120/ijcadab8ea8eb453
Rama Krishna Reddy Arumalla. AI-Assisted Incident Detection and Automated Recovery in Distributed E-Commerce Systems. International Journal of Computer Applications. 187, 98 (April 2026), 6-11. DOI=10.5120/ijcadab8ea8eb453
@article{ 10.5120/ijcadab8ea8eb453,
author = { Rama Krishna Reddy Arumalla },
title = { AI-Assisted Incident Detection and Automated Recovery in Distributed E-Commerce Systems },
journal = { International Journal of Computer Applications },
year = { 2026 },
volume = { 187 },
number = { 98 },
pages = { 6-11 },
doi = { 10.5120/ijcadab8ea8eb453 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2026
%A Rama Krishna Reddy Arumalla
%T AI-Assisted Incident Detection and Automated Recovery in Distributed E-Commerce Systems
%J International Journal of Computer Applications
%V 187
%N 98
%P 6-11
%R 10.5120/ijcadab8ea8eb453
%I Foundation of Computer Science (FCS), NY, USA
Distributed e-commerce systems now face unprecedented uptime and performance challenges because of the complexity of microservice architectures. This study proposes an Intelligent Observability and Incident Response Framework that proactively detects bottlenecks and automates recovery. The study is based on a filtered dataset of 452 operational telemetry samples covering request latency, CPU utilization, memory pressure, and error rates recorded during peak-traffic scenarios. The framework combines a stack of open-source monitoring agents, time-series databases, and automated orchestration engines to shift monitoring toward predictive observability. The findings show significant reductions in Mean Time to Detect and Mean Time to Repair. These results indicate that machine learning, used alongside conventional telemetry, can identify silent failures that conventional threshold-based alerts miss. The paper describes the architecture design, the implementation of the intelligent layer, and an overall discussion of system performance under different load conditions, which can serve as a blueprint for resilient digital commerce infrastructure.
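The contrast between threshold-based alerting and a learned baseline can be illustrated with a minimal sketch. The abstract does not specify the detection model, so the rolling z-score below is a simple stand-in chosen for illustration, not the paper's method. The function names, the 500 ms limit, the window size, and the latency values are all hypothetical.

```python
from statistics import mean, stdev


def threshold_alerts(values, limit):
    """Return indices where a value exceeds a fixed limit (conventional alerting)."""
    return [i for i, v in enumerate(values) if v > limit]


def zscore_alerts(values, window=10, z_limit=3.0):
    """Return indices where a value deviates from the trailing window's mean
    by more than z_limit sample standard deviations."""
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat baseline: no scale to compare against, so skip.
            continue
        if abs(values[i] - mu) / sigma > z_limit:
            alerts.append(i)
    return alerts


# Hypothetical request latencies in milliseconds. The jump at index 12 stays
# far below a 500 ms static limit, so a fixed threshold never fires.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101,
              100, 99, 180, 185, 182]

static_hits = threshold_alerts(latency_ms, limit=500)
baseline_hits = zscore_alerts(latency_ms)
print("static threshold alerts:", static_hits)
print("rolling z-score alerts:", baseline_hits)
```

On this made-up series the static threshold reports nothing, while the rolling baseline first fires at index 12, where latency jumps. Because the anomalous points quickly inflate the trailing window's variance, later points in the same incident may stop being flagged, which is one reason a production detector would likely need something more robust than this sketch.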