International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 71
Published: January 2026
Authors: Padmashree G., Murali G. Rao
DOI: 10.5120/ijca2026925626
Padmashree G., Murali G. Rao. Local–Global Feature Fusion Using CNN and Vision Transformer with Ensemble Post-Classification for Diabetic Retinopathy Diagnosis. International Journal of Computer Applications. 187, 71 (January 2026), 15-24. DOI=10.5120/ijca2026925626
@article{ 10.5120/ijca2026925626,
author = { Padmashree G. and Murali G. Rao },
title = { Local–Global Feature Fusion Using CNN and Vision Transformer with Ensemble Post-Classification for Diabetic Retinopathy Diagnosis },
journal = { International Journal of Computer Applications },
year = { 2026 },
volume = { 187 },
number = { 71 },
pages = { 15-24 },
doi = { 10.5120/ijca2026925626 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2026
%A Padmashree G.
%A Murali G. Rao
%T Local–Global Feature Fusion Using CNN and Vision Transformer with Ensemble Post-Classification for Diabetic Retinopathy Diagnosis
%J International Journal of Computer Applications
%V 187
%N 71
%P 15-24
%R 10.5120/ijca2026925626
%I Foundation of Computer Science (FCS), NY, USA
Diabetic retinopathy (DR) is a leading cause of vision impairment worldwide, and timely, accurate diagnosis is needed to prevent irreversible damage. This paper proposes a novel hybrid deep learning framework that combines local and global feature representations for robust DR classification from retinal fundus images. A convolutional neural network (CNN) branch extracts local features, capturing fine-grained pathological patterns such as microaneurysms and hemorrhages. In parallel, a Vision Transformer (ViT) learns global contextual features by modeling long-range dependencies across the retinal image. The features from both branches are fused and passed through a series of dense layers for initial classification. To further improve generalization and interpretability, features from the global average pooling layer are used to train a random forest classifier. The methodology is evaluated on a benchmark DR dataset with five severity classes. Experiments and ablation studies indicate that the architecture captures both fine-grained and holistic features, leading to improved classification performance. The results suggest that fusing local and global features, combined with ensemble post-classification, can provide a robust and scalable solution for automated DR diagnosis.
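
The two feature-level operations named in the abstract, global average pooling and local-global fusion, can be illustrated with a minimal sketch. It uses only the Python standard library and toy numbers. Several things here are assumptions rather than details from the paper: fusion is shown as plain concatenation (the abstract does not name the fusion operator), the function names `global_average_pool` and `fuse_features` are hypothetical, and the feature values are made up for illustration.

```python
from statistics import fmean
from typing import List, Sequence

# One channel of a feature map: H rows of W values.
FeatureMap = Sequence[Sequence[float]]


def global_average_pool(feature_maps: Sequence[FeatureMap]) -> List[float]:
    """Collapse each H x W channel to its mean, giving one value per channel."""
    return [fmean(v for row in channel for v in row) for channel in feature_maps]


def fuse_features(local_vec: Sequence[float], global_vec: Sequence[float]) -> List[float]:
    """Fuse two feature vectors by concatenation.

    Assumption: concatenation stands in for the paper's fusion step,
    which the abstract does not specify.
    """
    return list(local_vec) + list(global_vec)


# Toy stand-ins for the two branches' outputs (hypothetical values).
cnn_maps = [
    [[1.0, 3.0], [5.0, 7.0]],  # channel 0 of a CNN feature map
    [[0.0, 0.0], [2.0, 2.0]],  # channel 1
]
vit_embedding = [0.5, -0.5, 1.5]  # e.g. a global embedding from the transformer branch

local_vec = global_average_pool(cnn_maps)
fused = fuse_features(local_vec, vit_embedding)

print(local_vec)  # [4.0, 1.0]
print(fused)      # [4.0, 1.0, 0.5, -0.5, 1.5]
```

In a full pipeline, vectors like `fused` would feed the dense classification layers, and pooled feature vectors collected over the training set could then be used to fit a separate random forest (for example, scikit-learn's `RandomForestClassifier`). The abstract gives no layer sizes or hyperparameters, so none are assumed here.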