Clinical Validation of EndoQ’s AI Polyp Detection Published in Gut

The clinical validation study of EndoQ’s AI-based computer-aided detection (CADe) system for colorectal polyp detection has been published in Gut, one of the leading international journals in gastroenterology.

This publication marks a major milestone: our system was evaluated in a prospective multicentre real-time clinical study, not in a retrospective or artificial benchmark setting.


A new validation approach: real-time unblinding

The study introduced an innovative methodology for AI validation in endoscopy: real-time unblinding with a composite reference standard.

A second observer monitored the CADe output during colonoscopy, while the endoscopist remained blinded to the AI detections. When CADe detected a potential lesion missed by the endoscopist, real-time reinspection was performed. This approach allowed assessment of the true additional clinical value of AI beyond human detection.

This differs fundamentally from earlier studies that compared AI and human performance in isolation.

Study design and population

  • Prospective multicentre trial
  • 295 patients
  • 606 polyps detected endoscopically
  • Evaluation against a composite human–AI reference standard

The system incorporates a temporal algorithm combining convolutional neural networks with a recurrent neural network component, allowing it to “remember” previous frames and reduce false positives.
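The paper does not spell out the architecture beyond this description, but the effect of temporal "memory" on false positives can be illustrated with a deliberately simplified sketch. Here, a hypothetical exponential moving average over per-frame detection scores stands in for the recurrent component (the threshold and smoothing factor are illustrative assumptions, not values from the study):

```python
# Hypothetical sketch: temporal smoothing of per-frame CADe scores.
# A lone high score on a single frame (a likely false positive) is
# damped, while a detection sustained across consecutive frames
# accumulates and crosses the alert threshold.

def smooth_scores(frame_scores, alpha=0.5):
    """Exponentially smooth raw per-frame detection scores in [0, 1]."""
    smoothed, state = [], 0.0
    for score in frame_scores:
        state = alpha * score + (1 - alpha) * state
        smoothed.append(state)
    return smoothed

def detections(frame_scores, threshold=0.6, alpha=0.5):
    """Per-frame boolean detections after temporal smoothing."""
    return [s >= threshold for s in smooth_scores(frame_scores, alpha)]

# A single-frame spike is suppressed:
print(detections([0.0, 0.9, 0.0, 0.0]))  # [False, False, False, False]

# A lesion visible across several frames triggers an alert:
print(detections([0.9, 0.9, 0.9]))       # [False, True, True]
```

A real recurrent network learns this kind of temporal filtering from data rather than using a fixed smoothing rule, but the trade-off is the same: a short delay before alerting in exchange for fewer spurious single-frame detections.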

Key results

The results are clear and clinically meaningful:

  • Diagnostic accuracy:
    • Endoscopist: 98.2%
    • CADe: 96.5%
  • Non-inferiority demonstrated (margin 5%, p < 0.001)
  • Miss rate:
    • Endoscopist: 1.8%
    • CADe: 3.5%

Importantly, 11 polyps initially missed by the endoscopist were detected by CADe and confirmed after real-time unblinding.

In a challenging real-life setting with experienced endoscopists with a high adenoma detection rate (ADR > 35%), the AI system proved non-inferior in sensitivity.
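For readers unfamiliar with non-inferiority framing: the claim is not that CADe outperforms the endoscopist, but that its accuracy does not fall more than the pre-specified margin below the endoscopist's. The full statistical analysis in the paper involves confidence intervals; the sketch below only reproduces the headline point-estimate arithmetic:

```python
# Point-estimate check against the pre-specified non-inferiority
# margin of 5 percentage points (full analysis uses confidence
# intervals; this is only the headline arithmetic).
endoscopist_acc = 98.2  # %
cade_acc = 96.5         # %
margin = 5.0            # percentage points

difference = endoscopist_acc - cade_acc
print(f"Accuracy difference: {difference:.1f} pp")        # 1.7 pp
print("Within margin (point estimate):", difference < margin)  # True
```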

Why this matters

Validation in medical AI is not about benchmark scores. It is about performance under real clinical conditions.

This study demonstrates:

  • Real-time feasibility at 30 frames per second
  • Robust detection performance in vivo
  • Clinically acceptable false-positive rates
  • Objective methodology to quantify added value over human detection
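Real-time feasibility at 30 frames per second implies a hard per-frame latency budget for the whole detection pipeline, which is easy to make concrete:

```python
# Per-frame latency budget implied by real-time operation at 30 fps:
# every frame must be acquired, processed, and displayed within it.
fps = 30
budget_ms = 1000 / fps
print(f"Per-frame budget: {budget_ms:.1f} ms")  # 33.3 ms
```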

Publishing these results in a high-impact peer-reviewed journal provides independent scientific validation and strengthens the regulatory foundation for future CE marking activities.