How to improve a 5-class Diabetic Retinopathy model (APTOS 2019) – Mixed predictions across classes [P]
Hi everyone,
I'm a final-year Computer Engineering student building a Flask-based AI Diabetic Retinopathy Detection system. The web application itself is complete with patient management, authentication, dashboard, PDF report generation, prediction history, and AI inference.
The only issue I'm facing is with the AI model.
I'm using a 5-class Diabetic Retinopathy classifier trained on the APTOS 2019 dataset.
Classes:
No DR
Mild
Moderate
Severe
Proliferative DR
The model predicts all five classes, but the predictions are inconsistent.
Examples:
Moderate is sometimes classified as Severe or Proliferative.
Severe is often classified as Moderate or Proliferative and is rarely predicted correctly.
Some fundus images from outside the APTOS dataset produce completely unexpected results.
The model sometimes shows very high confidence (90%+) even when the prediction appears incorrect.
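One way to put numbers on the confusions listed above is a confusion matrix plus quadratic weighted kappa, which was the evaluation metric of the APTOS 2019 competition and penalizes a Moderate-to-Proliferative error more than a Moderate-to-Severe one. The sketch below is a minimal pure-Python version, not the code used in this project. The function names and the example labels are hypothetical, and it assumes grades are encoded as integers 0 to 4 in the order of the class list above.

```python
# Minimal sketch: confusion matrix and quadratic weighted kappa (QWK)
# for ordinal grades 0..4 (No DR, Mild, Moderate, Severe, Proliferative DR).
# Function names and example data are illustrative assumptions.

def confusion_matrix(y_true, y_pred, n_classes=5):
    """Count matrix: rows are true grades, columns are predicted grades."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """Agreement score where errors cost (distance between grades) squared."""
    observed = confusion_matrix(y_true, y_pred, n_classes)
    total = len(y_true)
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(observed[i][j] for i in range(n_classes))
                for j in range(n_classes)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            # Counts expected if predictions were independent of the labels.
            expected = row_sums[i] * col_sums[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0.0:
        # Degenerate case: every label and prediction is the same single grade.
        return 1.0
    return 1.0 - numerator / denominator


# Made-up validation labels, only to show the call pattern.
y_true = [0, 0, 1, 2, 2, 3, 3, 4]
y_pred = [0, 0, 1, 2, 3, 2, 4, 4]

cm = confusion_matrix(y_true, y_pred)
kappa = quadratic_weighted_kappa(y_true, y_pred)
for grade, row in enumerate(cm):
    print(grade, row)
print("QWK:", round(kappa, 4))
```

Reading row 3 (Severe) of the matrix on a held-out split shows directly how often Severe goes to Moderate versus Proliferative, which is more informative than a single accuracy number when the classes are ordered and imbalanced.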
Things I've already tried:
Different pretrained models (including a ResNet50 trained on APTOS)
ResNet152 implementation
Correct preprocessing (RGB conversion, resizing, normalization)
Verified class mapping
Softmax confidence scores
Test-Time Augmentation (TTA)
Image quality validation
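Regarding the softmax confidence scores in the list above: raw softmax outputs are often overconfident, and one common post-hoc remedy (not something from the list above) is temperature scaling, where the logits are divided by a single scalar T fitted on a held-out validation set. The sketch below is a minimal pure-Python illustration under stated assumptions: the function names are hypothetical, the logits and labels are made-up numbers, and a simple grid search stands in for a proper optimizer. It only rescales confidence; it does not change which class is predicted, so it will not fix the Moderate/Severe confusions by itself.

```python
# Minimal sketch: temperature scaling for softmax confidence calibration.
# All names and numbers here are illustrative assumptions.
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def negative_log_likelihood(logits_batch, labels, temperature):
    """Mean NLL of the true labels at a given temperature."""
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        prob = softmax(logits, temperature)[label]
        total -= math.log(max(prob, 1e-12))  # floor avoids log(0)
    return total / len(labels)


def fit_temperature(logits_batch, labels, candidates=None):
    """Pick the temperature with the lowest validation NLL by grid search."""
    if candidates is None:
        candidates = [0.5 + 0.1 * k for k in range(46)]  # 0.5 to 5.0
    return min(
        candidates,
        key=lambda t: negative_log_likelihood(logits_batch, labels, t),
    )


# Made-up validation logits: two confident correct predictions and two
# confident mistakes on Severe (grade 3) images.
val_logits = [
    [6.0, 0.0, 0.0, 0.0, 0.0],  # true grade 0, predicted 0
    [0.0, 0.0, 6.0, 1.0, 0.0],  # true grade 3, predicted 2
    [0.0, 0.0, 0.0, 1.0, 6.0],  # true grade 3, predicted 4
    [0.0, 0.0, 6.0, 0.0, 0.0],  # true grade 2, predicted 2
]
val_labels = [0, 3, 3, 2]

temperature = fit_temperature(val_logits, val_labels)
before = max(softmax(val_logits[1], 1.0))
after = max(softmax(val_logits[1], temperature))
print("fitted temperature:", temperature)
print("top confidence before:", round(before, 3), "after:", round(after, 3))
```

In a Flask inference path, the fitted T would be stored alongside the model weights and applied to the logits before the softmax that produces the displayed confidence. Calibration on APTOS validation data gives no guarantee for fundus images from other cameras or datasets.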