Breast cancer is the most common cancer among women, with an estimated 2 million new cases and 627,000 deaths globally in 2020. Early diagnosis and appropriate treatment are key factors in improving patient outcomes and survival. Several prognostic biomarkers (clinical, pathological and molecular) are routinely used to assess the risk of disease progression and the response to targeted therapy, and thus to guide treatment decisions. However, limitations remain, and new tools are needed to better distinguish patients at high risk of disease progression and death from those at low risk, and thus to refine the choice of therapeutic strategy. In this context, artificial intelligence (AI) applied to digitized histopathological images could be a powerful tool for predicting disease evolution, as AI may be able to detect morphological features that are not immediately apparent to the naked eye under the microscope. In this study, we aimed to develop a deep-learning algorithm able to predict the 10-year overall survival of patients with breast carcinoma, based solely on the analysis of the H&E-stained whole slide images of breast tumors used for diagnosis.
We introduce a deep neural network (DNN) specifically designed to compute a survival risk score for patients with breast carcinoma directly from whole slide images (WSI) of H&E-stained tumor sections, without any annotations. The model integrates two distinct kinds of morphological information, one capturing cellular-level features and the other tissue-level features, and is trained with the Cox proportional hazards loss function. The model pipeline is presented in Figure 2. The algorithm was trained and evaluated on the publicly available TCGA-BRCA dataset. Cases without a date of diagnosis, without a WSI, or with marker-pen artifacts were excluded from this study, leaving a total of 1,003 patients, among whom 141 deaths were recorded. We used a 5-fold stratified framework and, to strengthen both the assessment of our pipeline and the model's robustness, introduced a cross-testing and cross-validation technique. The concordance index (c-index) was used as the metric to assess the performance of the proposed algorithm. The model outputs were subsequently integrated into a Cox model alongside clinical features such as age and TNM stage, to evaluate whether our AI-based risk score can serve as an independent prognostic factor for survival. Furthermore, we generated a heat map on the WSI showing the regions of interest (ROI) that our model relies on to produce the AI-based risk score.
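The two quantities named above, the Cox loss and the c-index, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, tied follow-up times are handled only approximately, and the loss is shown for a vector of already computed risk scores rather than inside a DNN training loop.

```python
import numpy as np

def cox_neg_partial_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed deaths.

    Assumption: no special correction for tied follow-up times.
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # Sort by decreasing follow-up time so that the risk set of patient i
    # (patients still under observation at time_i) is the prefix 0..i.
    order = np.argsort(-time, kind="stable")
    risk, event = risk[order], event[order]
    # Running log-sum-exp of the risk scores over each prefix.
    log_risk_set = np.logaddexp.accumulate(risk)
    # Only patients with an observed death contribute a term.
    return -np.sum((risk - log_risk_set)[event]) / max(event.sum(), 1)

def concordance_index(risk, time, event):
    """Harrell's c-index: fraction of comparable pairs ranked correctly."""
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # Pair (i, j) is comparable when i died and j was followed for longer.
    comparable = event[:, None] & (time[None, :] > time[:, None])
    if not comparable.any():
        raise ValueError("no comparable pairs")
    diff = risk[:, None] - risk[None, :]
    # A higher risk for the patient who died first is concordant;
    # equal risk scores count as half.
    score = (diff > 0) + 0.5 * (diff == 0)
    return float((score * comparable).sum() / comparable.sum())
```

A c-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values reported below should be read.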
Our model achieved an average concordance index of 0.682 when assessed on the testing set. The Cox model incorporating clinical features (age and TNM stage) produced an average c-index of 0.767; this value rose to 0.786 when our AI-based risk score was included. These results indicate that the proposed model not only matches but surpasses the performance of existing models in predicting survival, thanks to the robust cross-testing and cross-validation technique. The Cox model indicates that our AI-based risk score can be used as an independent prognostic factor for predicting overall survival (p<0.005). Furthermore, we were able to significantly discriminate two groups of patients in terms of survival outcome, according to whether their AI-based risk was high or low. A Kaplan-Meier survival curve is provided in Figure 4, combining results from each of the five testing sets and illustrating the stratification of the population into two distinct groups, with a median overall survival (OS) of 8.5 years in the high-risk group, whereas the median OS was not reached in the low-risk group (log-rank test p-value of 9.02e-10). Our findings demonstrate significant potential for predicting a risk score for breast cancer patients. Finally, an example of a heat map of a WSI is shown in Figure 5. It clearly highlights ROI in the tumor tissue that carry important prognostic value for our model. This approach aims to provide a first step toward interpretability.
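The stratification into high-risk and low-risk groups and the notion of a median OS that is "not reached" can be sketched as follows. The abstract does not state how the risk threshold was chosen, so the median split below is an assumption made purely for illustration, and all function names are hypothetical.

```python
import numpy as np

def kaplan_meier(time, event):
    """Return the distinct death times and the Kaplan-Meier estimate at each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    survival = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        at_risk = np.sum(time >= t)            # still under observation at t
        deaths = np.sum((time == t) & event)   # observed deaths at t
        s *= 1.0 - deaths / at_risk
        survival[k] = s
    return event_times, survival

def median_survival(time, event):
    """First time at which S(t) <= 0.5, or None when the median is not reached."""
    t, s = kaplan_meier(time, event)
    below = np.nonzero(s <= 0.5)[0]
    return float(t[below[0]]) if below.size else None

def split_by_median_risk(risk):
    """Boolean mask of the high-risk group (assumed threshold: median score)."""
    risk = np.asarray(risk, dtype=float)
    return risk > np.median(risk)
```

With this convention, a group whose estimated survival curve never drops to 0.5 within the follow-up period has no defined median, which is what "median OS not reached" expresses for the low-risk group.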
Our algorithm can automatically extract prognostic morphological features from H&E-stained WSI to produce an AI-based risk score with a significant prognostic impact on overall survival in breast cancer patients. These results are important as they could aid clinical decision-making and improve the quality of patient care. This emerging and disruptive prognostic approach represents a new concept in the field of precision oncology and personalized medicine. Further studies on independent cohorts are required to validate the performance of the algorithm, to demonstrate its superiority over current prognostic markers, and to improve its explainability. Indeed, deciphering and understanding the prognostic ROI in tumor tissue will add important information regarding the interpretability of our model, and thus increase confidence in it.
Joseph Rynkiewicz, Julian Paul, Yahia Salhi, Celine Bossard, Sanae Salhi, Jerome Chetritt