Artificial intelligence in colonoscopy is often judged by one metric: adenoma detection rate (ADR).
But does increased detection actually change clinical outcomes?
Our latest publication in Gut addresses exactly that question: what are the real clinical consequences of computer-aided polyp detection (CADe) in daily practice?
A large prospective multicentre study
This prospective European multicentre trial included:
- 946 patients
- 2141 polyps detected
- 989 adenomas confirmed by histology
- Nine participating centres across Europe
The study used a real-time unblinding design: a second observer monitored the CADe output while the endoscopist remained blinded to it. This created a composite reference standard and allowed an objective comparison between human and AI detection in real clinical practice.
Sensitivity: AI vs human
When endoscopy was used as the reference standard:
- Endoscopist sensitivity: 96.0%
- CADe sensitivity: 94.6%
- No statistically significant difference
However, when histology was used as the gold standard:
- CADe sensitivity increased from 94.6% to 96.0%
- Endoscopist sensitivity decreased from 96.0% to 94.9%
- This difference was statistically significant (p = 0.03)
In other words: for histologically confirmed polyps, CADe slightly outperformed human detection.
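As a rough illustration of how a composite reference standard translates into sensitivity figures, here is a minimal Python sketch. It assumes the reference is simply the union of lesions found by the endoscopist and lesions flagged by CADe; the study's exact construction may differ. The function name and the toy lesion IDs are hypothetical and not taken from the publication.

```python
from typing import Set, Tuple


def composite_sensitivities(
    endoscopist_ids: Set[int], cade_ids: Set[int]
) -> Tuple[float, float]:
    """Return (endoscopist, CADe) sensitivity against the union of both readers.

    Assumption: the composite reference standard is every lesion found by
    at least one of the two readers.
    """
    reference = endoscopist_ids | cade_ids
    if not reference:
        raise ValueError("no lesions in the reference standard")
    return (
        len(endoscopist_ids) / len(reference),
        len(cade_ids) / len(reference),
    )


# Toy example with made-up lesion IDs (NOT study data):
# lesion 1 is seen only by the endoscopist, lesion 5 only by CADe.
endoscopist = {1, 2, 3, 4}
cade = {2, 3, 4, 5}

endo_sens, cade_sens = composite_sensitivities(endoscopist, cade)
print(f"Endoscopist: {endo_sens:.1%}, CADe: {cade_sens:.1%}")
```

Restricting both sets to histologically confirmed lesions before calling the function would mimic the switch from an endoscopic to a histological reference standard.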
Additional detection: what was found?
CADe detected 86 additional polyps that were initially missed by endoscopists.
Key characteristics of these extra detections:
- 98% were diminutive or small lesions
- 40% were adenomas
- 15% were sessile serrated lesions
- Only one was an advanced adenoma (>10 mm)
This confirms a consistent finding in AI-assisted colonoscopy: the incremental yield mainly concerns small lesions.
Did this change surveillance intervals?
This is where the study becomes clinically relevant.
As a result of the additional adenomas detected by CADe:
- Surveillance intervals changed in 22 out of 946 patients
- This corresponds to 2.3% of the total study population
So while CADe reduced the adenoma miss rate and improved detection metrics, the impact on follow-up strategy was limited.
That is an important and honest conclusion.
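The 2.3% figure follows directly from the two counts reported above. A quick check in Python, using only those two numbers (the variable names are ours):

```python
# Counts as reported in this post
patients_total = 946
patients_with_changed_interval = 22

share = patients_with_changed_interval / patients_total
print(f"{share:.1%}")  # prints "2.3%"
```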
Effect on quality metrics
Despite the modest impact on surveillance intervals, quality indicators improved substantially:
- Polyp Detection Rate increased from 46.9% to 69.5%
- Adenoma Detection Rate increased from 37.6% to 48.1%
Interestingly, the increase in ADR was most pronounced among endoscopists with a low or moderate baseline ADR, while high-ADR endoscopists showed limited improvement.
This suggests that AI may function as a quality equaliser, reducing variability between operators.
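To put the two quality metrics in perspective, it helps to separate the absolute gain (in percentage points) from the relative gain (as a percentage of the baseline). The sketch below does that for the four rates reported above; the helper function is illustrative, and the derived gains are our own arithmetic rather than figures quoted from the paper.

```python
from typing import Tuple


def gain(before_pct: float, after_pct: float) -> Tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = after_pct - before_pct
    relative = absolute / before_pct * 100
    return absolute, relative


# Rates as reported in this post (before, after), in percent
metrics = {
    "PDR": (46.9, 69.5),
    "ADR": (37.6, 48.1),
}

for name, (before, after) in metrics.items():
    absolute, relative = gain(before, after)
    print(f"{name}: +{absolute:.1f} percentage points, +{relative:.1f}% relative")

# PDR: +22.6 percentage points, +48.2% relative
# ADR: +10.5 percentage points, +27.9% relative
```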
False positives and procedural time
AI assistance is never free.
The system generated:
- 26,541 false positive detections (counted regardless of duration)
- Approximately 1.7 clinically irrelevant false positives per minute of clean withdrawal time
Mean clean withdrawal time increased by 6.6 minutes (+42.6%).
This highlights a critical reality: improved detection comes with workflow implications.
What does this mean?
This study moves beyond the typical “AI improves ADR” narrative.
It shows that:
- CADe sensitivity is comparable to well-performing endoscopists
- Histologically confirmed detection may slightly favour AI
- The additional detection predominantly concerns diminutive lesions
- Surveillance intervals change only in a small minority of patients
- Workflow and inspection time are meaningfully affected
In short: AI improves quality metrics and reduces the miss rate, but the downstream clinical consequences are modest in high-performing settings.
Why this matters for EndoQ
For EndoQ, this publication strengthens two pillars:
- Scientific credibility – rigorous, multicentre, real-world validation
- Clinical transparency – realistic assessment of benefits and limitations
AI in medicine must be evaluated not only on technical performance, but on meaningful clinical outcomes.
This study contributes to a more mature, evidence-based understanding of how computer-aided detection fits into modern endoscopy.