March 8, 2025

USCAP

Automated Combined PD-L1 Scoring using Artificial Intelligence on Whole immunohistochemical Slide Image of various carcinoma types

DiaKwant

PD-L1

Abstract

Background

Immunotherapy that activates the immune system has shown remarkable results for many cancer patients. Various PD-L1 scores, based on PD-L1 expression assessed by immunohistochemistry, have been established and are used by pathologists to identify eligible patients. However, manual scoring is challenging, especially for the combined positive score (CPS), and high inter-observer variability has been reported, which could influence therapeutic decisions (Robert ME, Modern Pathol, 2023). Pathologists therefore need objective techniques to standardize the interpretation of PD-L1 expression. The aim of this study was to assess how artificial intelligence (AI) could reduce this inter-observer variability by proposing reproducible and reliable scores.

Methods

We developed a supervised deep neural network trained on annotated histological images from 46 PD-L1-immunostained tumor cases (head and neck, cervix, gastric, esophagus, gastroesophageal junction), labeled by senior pathologists. The network quantified immunopositive tumor cells (TC) and immune cells (IC) to propose a CPS. Its performance was evaluated on an independent set of 82 samples from various organs and platforms, relabeled by three senior pathologists in a blinded manner. A consensus score was reached for discordant cases. The routine score was also provided. The AI's classification accuracy, using organ-specific cutoffs, was compared with that of routine scoring.
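
For readers unfamiliar with the score, the commonly used CPS definition can be sketched as follows. The abstract does not state the exact formula applied to the network's cell counts, so this is a minimal illustration assuming the standard definition (PD-L1-positive tumor and immune cells divided by viable tumor cells, times 100, capped at 100). The function and parameter names are hypothetical.

```python
def combined_positive_score(positive_tumor_cells: int,
                            positive_immune_cells: int,
                            viable_tumor_cells: int) -> float:
    """Standard CPS definition (assumed here, not taken from the abstract):
    100 * (PD-L1+ tumor cells + PD-L1+ immune cells) / viable tumor cells,
    with the result capped at 100.
    """
    if viable_tumor_cells <= 0:
        raise ValueError("CPS is undefined without viable tumor cells")
    if positive_tumor_cells < 0 or positive_immune_cells < 0:
        raise ValueError("cell counts must be non-negative")
    raw = 100.0 * (positive_tumor_cells + positive_immune_cells) / viable_tumor_cells
    # Immune cells are counted in the numerator only, so the raw ratio
    # can exceed 100; by convention the score is capped.
    return min(raw, 100.0)


# Hypothetical counts for illustration only.
example_cps = combined_positive_score(30, 20, 1000)  # 5.0
```

With these made-up counts, 50 positive cells over 1000 viable tumor cells gives a CPS of 5, which sits exactly on the >= 5 cutoff.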

Results

We considered the following cutoffs: CPS >= 1 or >= 5 for gastric, esophagus, and gastroesophageal junction; CPS >= 1 for cervix and head and neck. For the >= 5 cutoff, the AI algorithm reached an accuracy of 0.96, versus 0.73 for the routine score. For the >= 1 cutoff, the AI algorithm reached an accuracy of 0.98, versus 0.80 for the routine score. Two Z-tests, one per cutoff, were used to compare these accuracies; both showed a statistically significant improvement of the AI algorithm over the routine score (p-value < 0.001).
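
The evaluation described above can be sketched in a few lines: dichotomize each CPS at a cutoff, compute accuracy against a reference, and compare two accuracies with a Z-test. The abstract does not specify the form of the Z-test or the number of samples per cutoff, so the code below assumes an unpaired, pooled two-proportion Z-test, and the counts in the usage example are hypothetical, not the study's data. All names are illustrative.

```python
import math

# Cutoffs as listed in the Results paragraph.
CUTOFFS = {
    "gastric": (1, 5),
    "esophagus": (1, 5),
    "gastroesophageal junction": (1, 5),
    "cervix": (1,),
    "head and neck": (1,),
}


def is_positive(cps: float, cutoff: float) -> bool:
    """A sample is positive when its CPS reaches the cutoff."""
    return cps >= cutoff


def accuracy(predicted, reference) -> float:
    """Fraction of samples whose predicted status matches the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def two_proportion_z_test(correct_a: int, n_a: int, correct_b: int, n_b: int):
    """Pooled two-proportion Z-test (assumed form); returns (z, two-sided p)."""
    p_a, p_b = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: method A correct on 45 of 50, method B on 35 of 50.
z, p = two_proportion_z_test(45, 50, 35, 50)
```

Because both scoring methods are applied to the same samples, a paired test could also be appropriate; the unpaired version is shown only because it is the simplest reading of "Z-test".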

Conclusions

This study demonstrates that AI-driven scoring of PD-L1 expression can assist routine manual scoring with greater accuracy and consistency, significantly reducing inter-observer variability. Further validation in real-life practice across diverse tumor types will be essential to confirm its utility, in terms of accuracy and consequently time savings, and to support broad clinical adoption.

Authors

Florian Thomas, Benedicte Cormier, Céline Bossard, Claire Magois, Hélène Roussel, Baptiste Gourdin, Valérie Lemerle, Ilham Chokri, Alexandre Collin, Laetitia Lambros, Abdelmajid Dhouibi, Jean-François Jazeron, Frédérique Jossic, Caroline Eymerit-Morin, Nizar Labaied, Aurore Mensah, Yahia Salhi, Jérôme Chetritt
