Immunohistochemistry as a reliable predictor of remission in patients with endometrial cancer: Establishment and validation of a machine learning model

Wang,Ruiqi; Wang,Jingyuan; Wu,Yuman; Zhu,Aoxuan; Li,Xingchen; Wang,Jianliu

doi:10.3892/ol.2024.14805

January-2025 Volume 29 Issue 1

Article Open Access

Authors:
- Ruiqi Wang
- Jingyuan Wang
- Yuman Wu
- Aoxuan Zhu
- Xingchen Li
- Jianliu Wang

Affiliations: Department of Obstetrics and Gynecology, Peking University People's Hospital, Beijing 100044, P.R. China

Copyright: © Wang et al. This is an open access article distributed under the terms of Creative Commons Attribution License.
Article Number: 59
Published online on: November 15, 2024

https://doi.org/10.3892/ol.2024.14805

Abstract

Endometrial cancer (EC) is the most common gynecologic cancer. Unfortunately, its prognosis remains poor due to limited screening and treatment options. To address this issue, the present study evaluated the predictive value of four immunohistochemical (IHC) indicators for overall survival (OS) and recurrence-free survival (RFS) in patients with EC. A total of 834 patients diagnosed with EC at Peking University People's Hospital between January 2006 and December 2020 were included. These patients were randomly divided into training and validation cohorts at a 2:1 ratio, and data on clinicopathological information and IHC indicators were collected. A total of 92 combinations of algorithms were assessed using the Leave-One-Out Cross-Validation framework to identify the one with the highest C-index. Survival curves and receiver operating characteristic (ROC) curves were used to estimate the accuracy of the clinicopathological factors and the four IHC indicators for predicting both OS and RFS. Independent predictors included estrogen receptor, progesterone receptor, body mass index, P53, FIGO stage, histology, grade, Ki67, ascites and lymph node metastasis. Both the training and validation cohorts exhibited excellent predictive performance for OS and RFS, as demonstrated by ROC curves at 1-year, 3-year and 5-year follow-ups. By introducing a model based solely on clinicopathological information as model 1 and adding four IHC indicators in model 2, a significant improvement was observed in the area under the curve (AUC) values across the entire sample. The AUC value for OS curves increased from 0.765 to 0.872, and the AUC for RFS curves rose from 0.791 to 0.882. Thus, the present study's model effectively predicts patients' probability of OS and RFS using these factors. This prediction capability can guide postoperative treatment plans and follow-up intervals, potentially enhancing long-term survival for patients with EC.

Introduction

Endometrial cancer (EC) is the most common gynecological cancer in the United States, with a troubling rise in related fatalities (1). This trend is also evident in developing countries, where both incidence and mortality rates are increasing (2,3). According to global cancer statistics published by CA-Cancer Journal for Clinicians in 2021, China reported 80,000 new cases of EC in 2020 (4). Despite generally favorable overall prognosis, mortality rates for EC are on the rise. By 2035, EC is projected to become the sixth leading cause of cancer-related deaths among women (5). Therefore, advancing early diagnostic and prognostic evaluation techniques for EC is crucial, as these improvements are key to enhancing survival rates for those affected by the disease.

In recent years, the classification and treatment of patients with EC have become increasingly precise, representing a major shift from tissue-based to gene-based approaches (6). Following the introduction of molecular subtypes of EC by The Cancer Genome Atlas in 2013 (7), over a decade of research has confirmed the predictive efficacy of these subtypes. These four molecular classifications were incorporated into the guidelines by the ESMO-ESGO-ESTRO consensus conference in 2016 and were officially included in the FIGO staging criteria in 2023, promoting molecular subtyping for all patients with EC. However, the present classification standards have notable limitations: i) Existing predictive factors are inadequate for fully assessing the risk of recurrence, especially in the early stages (8); and ii) routine molecular profiling is costly and numerous patients achieve favorable outcomes with hysterectomy alone, suggesting that a more cost-effective approach may be preferable.

The critical role of immunohistochemistry (IHC) in risk stratification for patients with EC is well-documented, demonstrating its practical application and high reproducibility (9). Despite advances in algorithm development, a need remains for a cost-effective and highly useful predictive model to assess recurrence risk. In the present study, basic clinical information and preoperative routine pathological IHC results were used, including estrogen receptor (ER), progesterone receptor (PR), P53, Ki67, lymph node metastasis (LNM), lymph-vascular space invasion (LVSI) and other indicators, to construct a predictive model. Survival outcomes associated with various histological behaviors in previous patients were analyzed, aiming to provide a more specific and sensitive model for patients with EC.

Materials and methods

Patient population

A retrospective study was conducted of patients diagnosed with EC at the Department of Obstetrics and Gynecology, Peking University People's Hospital (PKUPH), from January 2006 to December 2020. The inclusion criteria were: i) Age over 18 years; ii) histologically confirmed diagnosis of EC; iii) undergoing total hysterectomy with either systematic lymphadenectomy or sentinel lymph node dissection (10); iv) complete clinical information and postoperative pathological data. The exclusion criteria were: i) Presence of additional malignant tumors; ii) lack of medical records; iii) preoperative treatment history; iv) other serious illnesses (such as stroke and heart disease); and v) death from other causes during follow-up. Based on these criteria, a total of 834 cases were selected for subsequent analysis. The present study was approved (approval no. 2022PHB379) by the Ethics Committee Board of Peking University People's Hospital (Beijing, China), in accordance with the principles outlined in the Declaration of Helsinki (2013). Informed consent was obtained from all subjects.

IHC

All patients underwent IHC examination, with approval from the Institutional Review Board of PKUPH for tissue excision. Pathological surgical specimens were fixed in 4% paraformaldehyde at 25°C for 48 h. After dehydration in a gradient of ethanol and clarification in xylene, the tissue samples were infiltrated with paraffin and embedded. The embedded tissue blocks were then sectioned into 5 µm slices using a microtome. The sections were incubated in a 60–65°C oven for 1–1.5 h, then deparaffinized in a xylene and ethanol gradient. Antigen retrieval was performed by incubating the sections in sodium citrate buffer at 95°C for 10 min, followed by the addition of an endogenous peroxidase blocker (cat. no. BF06060; Biodragon) to the tissue. The sections were then washed three times with PBST (containing 0.1% Tween-20) for 3 min each. The primary antibody was applied to the tissue and incubated overnight at 4°C, followed by the addition of the secondary antibody and incubation at room temperature for 30 min. Detailed information about the antibodies is provided in the supplementary materials (Table SI). Color was developed with 3,3′-diaminobenzidine (DAB), and the nuclei were stained with hematoxylin at 25°C for 15 min. Two pathologists independently assessed each sample in a blinded manner, without prior knowledge of the patients' details. IHC evaluation of estrogen receptor (ER), PR and P53 included both the percentage of positive nuclear staining, from 0–100%, and staining intensity, which was graded on a scale from 0 to 3. On this scale, 0 indicated negative, 1 indicated weak staining (+), 2 indicated moderate staining (++), and 3 indicated strong staining (+++). Ki67 was evaluated based solely on the percentage of positive nuclear staining. Representative IHC images are shown in Figs. S1–S4.
In summary, the expression patterns of these four IHC markers in the patients were derived from the pathology reports and were reviewed and confirmed by two experienced pathologists.

Construction of prognostic model

To develop a model with high accuracy and stability, 10 machine learning algorithms were integrated into 92 algorithm combinations. The algorithms included random survival forest (RSF), elastic net, least absolute shrinkage and selection operator (Lasso), Ridge, StepCox, CoxBoost, partial least squares regression for Cox, supervised principal components, generalized boosted regression modeling, and survival support vector machine. In each combination, one algorithm was used to filter the variables and another to build the prognostic signature. Out of 100 possible combinations of machine learning algorithm pairs, eight were excluded because the final prognostic signature included fewer than five variables. Leave-One-Out Cross-Validation (LOOCV) provides a nearly unbiased estimate of test error and evaluates the model on every data point, which supports the accuracy of the predictive model. The principle of this algorithm is as follows: One observation is selected as the test data, while all remaining observations are used as the training data. The model is then trained, and this process is repeated for each observation in the dataset. The test error is estimated by averaging the errors across all iterations.
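The leave-one-out principle described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's pipeline: the names `fit_mean` and `loocv_error` and the toy values are hypothetical, and a trivial mean predictor with squared error stands in for the survival models and the C-index.

```python
from statistics import mean

def fit_mean(train_y):
    """Hypothetical stand-in model: always predicts the training mean."""
    m = mean(train_y)
    return lambda: m

def loocv_error(y):
    """Leave-one-out CV: hold out each observation once, train on the
    remaining n-1 observations, and average the squared test errors."""
    errors = []
    for i in range(len(y)):
        train = y[:i] + y[i + 1:]                 # every observation except the i-th
        predict = fit_mean(train)                 # train on the remaining data
        errors.append((y[i] - predict()) ** 2)    # test on the held-out point
    return mean(errors)

toy = [1.0, 2.0, 3.0, 4.0]  # made-up values, not study data
print(loocv_error(toy))
```

Because each of the n observations serves as the test set exactly once, the model is refitted n times, which is affordable for a cohort of a few hundred patients.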

The procedure for generating the signature was as follows: i) The collected patient demographic and pathological staining data were organized into numerical variables [age at diagnosis, BMI, ER percentage, PR percentage, P53 percentage, Ki67 percentage, overall survival (OS) time] and categorical variables [menopause status (premenopausal, postmenopausal), diabetes mellitus (without, with), hypertension (without, with), number of ER+ (0, 1, 2, 3), number of PR+ (0, 1, 2, 3), number of P53+ (0, 1, 2, 3), survival status (alive, deceased), ascites' cytology (negative, positive), histology [endometrioid endometrial adenocarcinoma (EEA), other types], LNM (negative, positive), lymph-vascular space invasion (negative, positive), myometrial invasion (<50%, ≥50%), cervical invasion (negative, positive), FIGO stage (I, II, III, IV) and grade (G1, G2, G3)]; ii) factors highly associated with prognosis were identified through univariate Cox regression; iii) the 92 algorithm combinations described above were used to construct predictive models for patients with EC; and iv) the Harrell concordance index (C-index) was computed for each model, and the model with the highest average C-index was selected as the final model.
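For readers unfamiliar with the Harrell C-index used in step iv, a minimal sketch follows. The function name `harrell_c_index` and the toy data are hypothetical, and the handling of tied event times is deliberately simplified (such pairs are skipped), so this should not be read as the implementation used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the subject
    with the shorter observed event time has the higher risk score.
    Ties in risk score count as 0.5; pairs with tied times are skipped."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: subject i had an event strictly before time j
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): follow-up times, event flags (1 = event), risk scores
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
print(harrell_c_index(times, events, risks))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk.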

Statistical analysis

The χ² test was applied to compare categorical variables, and the Wilcoxon rank-sum test or the unpaired t-test was used to assess continuous variables. Fisher's exact test was employed for the analysis of sample data with theoretical frequencies <5. The correlation between two continuous variables was evaluated using the Pearson correlation coefficient. The optimal cut-off value was determined with the survminer package. C-indices were compared using the compareC package. Cox regression and Kaplan-Meier analyses followed by the log-rank test were conducted with the survival package. ROC analysis was performed with the pROC package, and the area under the curve (AUC) for survival variables was assessed using the timeROC package. All data analyses were conducted with R version 4.3.2 (http://www.R-project.org; The R Foundation) and EmpowerStats (http://www.empowerstats.com; X&Y Solutions, Inc.). A two-tailed significance level of P<0.05 was considered to indicate a statistically significant difference.

Results

Clinical and pathological feature

In the present risk model, a total of 834 patients with EC were randomly assigned to two groups: The training cohort (n=556) and the validation cohort (n=278), in a 2:1 ratio. The clinical baseline features and clinicopathological characteristics of patients are presented in Tables I and II. Based on the P-values obtained from the unpaired t-test and Fisher's exact test, the differences between the two groups were not statistically significant. Both cohorts predominantly consisted of middle-aged and overweight patients, with mean ages of 56.49 and 55.85 years, and mean body mass indexes of 26.35 and 26.18 in the training and validation cohorts, respectively. In the training cohort, 364 patients (65.47%) were postmenopausal and 192 patients (34.53%) were premenopausal. Similarly, the validation cohort included 177 postmenopausal patients (63.67%) and 101 premenopausal patients (36.33%). Most patients were FIGO stage I, comprising 77.34% (430/556) of the training cohort and 85.25% (237/278) of the validation cohort.
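A 2:1 random split of 834 patients gives cohorts of 556 and 278. The sketch below shows one way such a split could be made reproducible; the function name `split_two_to_one` and the seed are assumptions, since the randomization procedure is not described in the text.

```python
import random

def split_two_to_one(patient_ids, seed=0):
    """Randomly assign IDs to training and validation cohorts at a 2:1 ratio."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)      # seeded shuffle for reproducibility
    n_train = round(len(ids) * 2 / 3)     # two thirds go to the training cohort
    return ids[:n_train], ids[n_train:]

# Hypothetical patient IDs 0..833 stand in for the real records
train, valid = split_two_to_one(range(834))
print(len(train), len(valid))  # 556 278
```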

Table I.

Baseline Information and Clinical Features of EC Patients - Continuous Variables

Table II.

Baseline Information and Clinical Features of EC Patients - Categorical Variables

Establishment of machine-learning model for pathology prediction

The present study analyzed 19 characteristic factors of patients with EC. Except for menopausal status, diabetes and hypertension, univariate Cox analysis revealed that the impact of the remaining factors on OS was statistically significant (Table III). Additionally, ROC curves were plotted for models incorporating four factors, three factors, and two factors, respectively (Figs. S5 and S6), demonstrating that the model including four IHC factors had the best predictive performance (AUC=0.951). A machine learning-based pathology-related model incorporating these 16 selected factors was developed.

Table III.

Univariate Cox analysis of OS and RFS.

Variables	OS, HR (95% CI) P-value	RFS, HR (95% CI) P-value
Age at diagnosis	1.07 (1.02,1.11) 0.0021	1.05 (1.02,1.09) 0.0047
Body mass index (kg/m²)	1.83 (1.05,3.60) 0.0458	2.35 (1.05,5.27) 0.0371
ER percentage	0.17 (0.06,0.48) 0.0009	0.20 (0.08,0.49) 0.0005
PR percentage	0.10 (0.04,0.28) <0.0001	0.11 (0.05,0.27) <0.0001
P53 percentage	6.18 (2.37,16.10) 0.0002	5.63 (2.56,12.40) <0.0001
Ki67 percentage	4.69 (2.02,10.88) 0.0003	2.92 (1.59,5.36) 0.0005
Menopause status
Premenopausal	1.0	1.0
Postmenopausal	2.29 (0.85,6.13) 0.0995	2.99 (1.22,7.37) 0.0171
Diabetes mellitus
Without	1.0	1.0
With	0.49 (0.15,1.65) 0.2501	0.67 (0.27,1.67) 0.3922
Hypertension
Without	1.0	1.0
With	1.26 (0.56,2.80) 0.5787	0.97 (0.48,1.95) 0.9212
Number of ER⁺
0	1.0	1.0
1	0.20 (0.08,0.52) 0.0009	0.15 (0.06,0.35) <0.0001
2	0.05 (0.01,0.38) 0.0042	0.06 (0.01,0.31) 0.0007
3	0.15 (0.03,0.72) 0.0183	0.14 (0.03,0.55) 0.0050
Number of PR⁺
0	1.0	1.0
1	0.15 (0.06,0.34) <0.0001	0.14 (0.07,0.31) <0.0001
2	0.08 (0.01,0.64) 0.0171	0.07 (0.01,0.57) 0.0131
3	0.06 (0.01,0.49) 0.0082	0.08 (0.02,0.38) 0.0014
Number of P53⁺
0	1.0	1.0
1	8.81 (2.05, 37.84) 0.0034	5.26 (1.96, 14.10) 0.0010
2	19.76 (2.78, 140.50) 0.0029	9.15 (1.53, 54.59) 0.0151
3	9.40 (0.85, 103.80) 0.0673	10.46 (1.72, 63.59) 0.0108
Ascites' cytology
Negative	1.0	1.0
Positive	10.08 (4.08, 24.90) <0.0001	11.65 (4.72, 28.74) <0.0001
Histology
EEA	1.0	1.0
Other types	11.45 (5.08, 25.84) <0.0001	12.03 (5.73, 25.26) <0.0001
Lymph node metastasis
Negative	1.0	1.0
Positive	21.02 (8.96, 49.30) <0.0001	14.44 (6.78, 30.76) <0.0001
Lymph-vascular space invasion
Negative	1.0	1.0
Positive	8.66 (3.84, 19.53) <0.0001	3.93 (1.90, 8.15) 0.0002
Myometrial infiltration
<50%	1.0	1.0
≥50%	15.33 (4.57, 51.42) <0.0001	7.08 (3.22, 15.55) <0.0001
FIGO stage
I	1.0	1.0
II	7.96 (1.46, 43.47) 0.0167	5.61 (1.42, 22.07) 0.0137
III	17.00 (5.23, 55.22) <0.0001	10.14 (4.13, 24.90) <0.0001
IV	149.67 (43.33, 517.00) <0.0001	71.00 (20.46, 246.32) <0.0001
Grade
G1	1.0	1.0
G2	0.47 (0.08, 2.79) 0.4025	2.32 (0.46, 11.64) 0.3079
G3	10.22 (3.02, 34.54) 0.0002	25.46 (5.90, 109.86) <0.0001

[i] ER, estrogen receptor; PR, progesterone receptor; OS, overall survival; RFS, recurrence-free survival.
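Each entry in Table III appears to report a hazard ratio with its 95% confidence interval and P-value. Assuming Wald-type intervals, which the text does not state explicitly, the conversion from a Cox coefficient to such an entry can be sketched as below. The function name and the numeric inputs are hypothetical and are not taken from the study.

```python
import math

def hazard_ratio_with_ci(beta, se, z=1.96):
    """Convert a Cox coefficient and its standard error into a hazard ratio
    with a Wald 95% confidence interval: exp(beta), exp(beta -/+ z*se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only
hr, lower, upper = hazard_ratio_with_ci(beta=math.log(2.0), se=0.1)
print(f"HR {hr:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

A hazard ratio below 1 (as for ER and PR percentage) indicates lower risk with higher values, whereas a ratio above 1 (as for P53 and Ki67 percentage) indicates higher risk.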

In the EC dataset, 92 prediction models were applied using the LOOCV framework and the C-index for each model was calculated (Fig. 1A). The combination of Lasso and stepwise Cox was selected, as it yielded the highest average C-index of 0.923. In Lasso regression, the optimal λ value was identified by minimizing the partial likelihood deviance using the LOOCV framework (Fig. 1B). Through stepwise Cox proportional hazards regression, a final set of 11 factors was determined from the original 16 factors (Fig. 1C). A risk score was calculated for each patient using the regression coefficients (Fig. 1D). The median risk score was used in each cohort as the threshold to stratify patients (Fig. 1E). As risk scores increased, survival time decreased, and the mortality rate increased (Fig. 1F). Combining multiple machine learning algorithms thus improved the accuracy of the present study's predictive model.
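The risk score and median-based stratification described above can be sketched as follows. The coefficients, feature names and patient values are hypothetical placeholders, not the fitted values shown in Fig. 1D, and the score is assumed to be the usual Cox linear predictor (sum of coefficient times feature value).

```python
from statistics import median

def risk_score(coefs, features):
    """Linear predictor: sum of coefficient * feature value."""
    return sum(coefs[name] * features[name] for name in coefs)

def stratify_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Hypothetical coefficients and patients, for illustration only
coefs = {"ki67": 1.2, "er": -0.8, "lnm": 2.0}
patients = [
    {"ki67": 0.1, "er": 0.9, "lnm": 0},
    {"ki67": 0.6, "er": 0.2, "lnm": 1},
    {"ki67": 0.3, "er": 0.5, "lnm": 0},
    {"ki67": 0.8, "er": 0.1, "lnm": 1},
]
scores = [risk_score(coefs, p) for p in patients]
labels = stratify_by_median(scores)
print(labels)  # ['low', 'high', 'low', 'high']
```

Using the cohort median as the cut-off yields two groups of roughly equal size, which keeps the Kaplan-Meier comparison between groups balanced.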

Figure 1.

Pathological prediction model developed and validated via the machine learning-based integrative procedure. (A) A total of 92 types of prediction models investigated via the Leave-One-Out Cross-Validation framework. The C-index of each model was calculated across all validation datasets. (B and C) In the training cohort (n=556), the determination of the optimal λ was based on when the partial likelihood deviance reached the minimum value, which further generated Lasso coefficients of the most useful prognostic features. (D) Coefficients of the 11 factors obtained in stepwise Cox regression. (E) Patients were stratified into high-risk and low-risk groups based on their risk scores. (F) Patients in high-risk groups had significantly higher numbers of deaths during the follow-up period. AUC, area under curve; ER, estrogen receptor; PR, progesterone receptor; BMI, body mass index; LNM, lymph node metastasis.

Evaluation of the pathological prediction model in OS

Kaplan-Meier plots and ROC curves were used to evaluate the relationship between risk scores and prognosis in patients with EC. The model demonstrated high accuracy according to ROC analysis. In the training cohort, the AUC for predicting OS at 1, 3 and 5 years was 0.918, 0.893 and 0.853, respectively (Fig. 2A). In the validation cohort, the AUCs were 0.995, 0.757 and 0.719, respectively (Fig. 2C). Furthermore, the OS rate for the high-risk group was significantly lower compared with the low-risk group, with P=5.615×10⁻⁷ (Fig. 2B) for the training cohort and P=1.374×10⁻³ (Fig. 2D) for the validation cohort. The model therefore stratified patient risk effectively and demonstrated strong predictive capability for near-term events.

Figure 2.

Combination of the stratification and the aforementioned model. The OS ROC curves and Kaplan-Meier survival curves were plotted for both the training and validation cohorts of patients with endometrial cancer. (A and B) OS ROC curves and Kaplan-Meier survival curves of the training cohort. (C and D) OS ROC curves and Kaplan-Meier survival curves of the validation cohort. OS, overall survival; ROC, receiver operating characteristic; AUC, area under curve.
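The Kaplan-Meier curves referred to above are product-limit estimates of survival. A minimal sketch of the estimator is given below; the function name `kaplan_meier` and the toy follow-up data are hypothetical, and the study itself used the R survival package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    events[i] is 1 for an observed death and 0 for censoring."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)  # still under follow-up at t
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths > 0:
            survival *= 1 - deaths / at_risk         # product-limit update
            curve.append((t, survival))
    return curve

# Toy follow-up data (hypothetical): times in years, 1 = death, 0 = censored
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at observed event times.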

Application of this model in recurrence-free survival (RFS)

A similar approach was used to develop a prognostic model for RFS in patients with EC. The integration of Lasso and stepwise Cox methods again performed best, achieving a C-index of 0.896 for the training group (Fig. 3A). After 11 factors were selected (Fig. 3B and C), patients were categorized into high-risk and low-risk groups using the new risk calculation formula (Fig. 3D and E). Patients in the high-risk group exhibited a shorter time to recurrence, as demonstrated by a denser concentration of red dots in the lower right corner (Fig. 3F).

Figure 3.

Pathological prediction model developed and validated via the machine learning-based integrative procedure. (A) A total of 92 types of prediction models were investigated via the Leave-One-Out Cross-Validation framework. The C-index of each model was further calculated across all validation datasets. (B and C) In the training cohort (n=556), the determination of the optimal λ was based on when the partial likelihood deviance reached the minimum value, which further generated Lasso coefficients of the most useful prognostic features. (D) Coefficients of the 11 factors obtained in stepwise Cox regression. (E) Patients were stratified into high-risk and low-risk groups based on their risk scores. (F) Patients in high-risk groups had significantly higher numbers of deaths during the follow-up period. ER, estrogen receptor; PR, progesterone receptor; BMI, body mass index; LNM, lymph node metastasis.

The present study's model demonstrated strong predictive capability, with AUC values exceeding 0.85 for both cohorts over a 5-year period. Notably, the training and validation cohorts achieved AUC values of 0.99 and 0.925 at 1 year, respectively (Fig. 4A and C, respectively). In the training cohort, patients in the low-risk group had a significantly improved RFS compared with those in the high-risk group (P=9.41×10⁻⁸; Fig. 4B). The validation cohort revealed similar outcomes (P=2.715×10⁻³; Fig. 4D). These results indicated that the present study's model provides strong predictive performance for both OS and RFS.

Figure 4.

Combination of the stratification and the aforementioned model. The RFS ROC curves and Kaplan-Meier survival curves were plotted for both the training and validation cohorts of patients with EC. (A and B) RFS ROC curves and Kaplan-Meier survival curves of the training cohort. (C and D) RFS ROC curves and Kaplan-Meier survival curves of the validation cohort. RFS, recurrence-free survival; ROC, receiver operating characteristic; AUC, area under curve.

Advantages of introducing IHC markers

To evaluate the added predictive value of IHC markers for OS and RFS in patients with EC, the IHC-related indicators were removed from the model and the AUC values with and without these indicators were compared. Including the four IHC factors improved predictive accuracy, with the AUC for OS increasing from 0.765 to 0.872 (Fig. 5A) and the AUC for RFS increasing from 0.791 to 0.882 (Fig. 5B).

Figure 5.

Alterations in model predictive capabilities following the incorporation of four IHC results. Solid red line represents model 1, which includes basic patient information; solid blue line represents model 2, which incorporates four additional IHC indicators. (A) Overall survival ROC curves in the entire sample. (B) Recurrence-free survival ROC curves in the entire sample. IHC, immunohistochemical; ROC, receiver operating characteristic; AUC, area under curve.
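As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it. This is a plain binary AUC on made-up scores, not the time-dependent ROC analysis used in the study, and the function name `auc` is hypothetical. The last line simply restates the gains reported for model 2 over model 1.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores (hypothetical): label 1 = event occurred
print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))  # 0.75

# AUC gains reported in the text: OS and RFS, model 2 minus model 1
gain_os = round(0.872 - 0.765, 3)
gain_rfs = round(0.882 - 0.791, 3)
print(gain_os, gain_rfs)  # 0.107 0.091
```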

In summary, the model of the present study demonstrated robust predictive performance, with high accuracy and reliability in forecasting both OS and RFS in patients with EC. This predictive capability highlights its potential utility in clinical decision-making and personalized treatment planning.

Discussion

EC ranks as the second most common gynecologic malignancy, with increasing incidence and mortality rates (4). In China, EC exhibits similar trends, with five-year survival rates varying based on FIGO staging. For patients diagnosed at an early stage (FIGO stage I), the five-year survival rate is ~90%. By contrast, for those with advanced-stage disease (FIGO stage IV), the survival rate significantly declines to ~15% (11). Research has identified numerous indicators that are strongly associated with poor prognosis in patients with EC (12,13). However, there is currently no comprehensive scoring system that assigns weights to these indicators and calculates a risk score for each patient. Such a system would enable stratification of OS and RFS risk levels. Therefore, there is an urgent need to develop an effective method to optimize treatment selection and improve patient survival outcomes.

In the present study, a predictive model was developed and validated to estimate the prognosis of patients with EC in terms of OS and RFS. These findings revealed that the model incorporating IHC indices exhibits superior predictive value compared with clinical models. Information on four IHC-related markers was included: ER, PR, Ki67 and P53. The emphasis on IHC results is well-supported, as numerous studies have revealed that these factors are strongly correlated with disease malignancy (14,15). Furthermore, some of these indices can indicate the molecular subtype of the disease, which is particularly beneficial for patients who cannot undergo genetic testing. This provides significant clinical advantages. When the present predictive model identifies a patient as belonging to the high-risk group, it guides clinicians to promptly administer an appropriate and comprehensive chemotherapy regimen following staging surgery, with the goal of improving the patient's long-term survival rate.

It has been indicated that pre-operative IHC biomarkers effectively evaluate patient prognosis, guiding subsequent surgical and adjuvant treatment plans. A previous study assessed the accuracy of P53 IHC in predicting TP53 mutations identified by next-generation sequencing in EC biopsy samples, finding a concordance rate of ≥95% (16). Moreover, IHC for P53, either alone or in combination with TP53 sequencing, is particularly useful for identifying specific high-risk tumor genotypes/phenotypes, which significantly improves patient outcomes (17).

A large retrospective study investigated the impact of ER expression on oncologic outcomes within a new risk classification for EC. The aforementioned study, which included 891 patients with EC, found that the ER 0/1+ phenotype was significantly associated with more advanced stages, higher rates of metastasis, and poorer prognoses (18). Current research confirms that incorporating the absence of ER and PR into clinical risk stratification helps identify high-risk patients with stage I–II EEA (19). Additionally, the absence of PR expression is an important independent predictor of tumor recurrence in these patients (20). Multivariate regression analysis has established that a Ki67 index of ≥33% is a significant independent predictor of recurrence. Patients with high Ki67 levels had notably poorer RFS and OS compared with those with lower Ki67 levels (P<0.001 and P=0.029, respectively) (21). The combined prognostic value of ER, PR and P53 with Ki67 surpassed the predictive accuracy of each individual marker. However, to date, no studies have combined oncological behavior with IHC expression to jointly predict OS and RFS in patients with EC. Additionally, research utilizing advanced technologies such as machine learning to enhance predictive accuracy in this context remains lacking.

Furthermore, the present study's model can assist patients with EC who have ambiguous FIGO staging by stratifying them based on their risk scores. This stratification allows us to refine the surgical plan and ensure a more comprehensive resection. Predictive models are already widely used in the preoperative diagnosis of EC. LNM is a significant risk factor for poor long-term prognosis, with LVSI (22) and a high metabolic syndrome score (23) serving as indicators for its occurrence. For instance, Yang et al (24) developed a nomogram to predict the probability of lymph node positivity in patients with stage IIIC EC. This nomogram demonstrated higher efficacy compared with FIGO staging. Moreover, numerous emerging indicators have been revealed to be associated with patient prognosis, including L1CAM (25), EPPK1 (26), FOXM1 (27) and TNFRSF4 (28). In the future, the authors plan to incorporate these indicators to further refine and enhance the predictive model. Compared with previous models developed at Peking University People's Hospital, the model in the present study demonstrated significant improvements. Notably, the incorporation of IHC indicators has substantially enhanced the predictive efficacy of this model.

With advancements in algorithms, machine learning has become widely used in model construction. Several studies have evaluated the impact of different algorithms on improving model performance. A recent study found that Random Forest is optimal for assessing OS and RFS in high-grade EC (29). Additionally, a model incorporating the latest algorithms can preoperatively predict the histology, stage and grade of EC, thereby assisting doctors in achieving more accurate diagnoses and predictive outcomes (30). In the present study, by evaluating 92 algorithm combinations, a scoring criterion was established to calculate individual risk scores for each patient. This scoring system allowed patients to be stratified into low-risk and high-risk groups. The OS and RFS rates at 1, 3 and 5 years for each group were calculated. In both the training and validation cohorts, the AUC values demonstrated favorable performance across the three time points. Notably, including four IHC indicators significantly enhanced the AUC values for both OS and RFS, strongly supporting the validity of the hypothesis. For example, patients with stage IA EC typically do not receive chemotherapy after comprehensive staging surgery. However, their risk of recurrence remains relatively high after 5 years. In such cases, the model of the present study could be used to evaluate the patient by collecting their clinical data and pathological information. If the model indicates that the patient is ‘high risk’, consideration could be given to administering a PC regimen (paclitaxel + platinum-based chemotherapy) in hopes of achieving improved long-term survival outcomes. Overall, a robust predictive model that greatly supports the development of precise treatment strategies for patients with EC has been developed.

The model can be easily replicated using patient demographics and IHC outcomes, which facilitates clinical application and adoption. However, several limitations must be acknowledged. First, the data were derived from a single institution, which necessitates further external validation to confirm the reliability of the model. Second, AI models were not applied in the process of obtaining pathology reports. Although the reports were jointly reviewed by two experienced pathologists, heterogeneity still exists. Third, the authors are planning a prospective study to determine whether risk stratification with this model improves clinical outcomes in patients. Owing to the limited duration of the present study, results from that prospective study are not yet available for publication. Finally, the authors have not developed a publicly accessible platform, such as a website, for physicians to use in predicting the outcomes of patients with EC. The absence of such a tool may hinder the broader dissemination and practical application of the predictive model in clinical settings. Nonetheless, to the best of the authors' knowledge, this is the first study to incorporate these four IHC results as indicators and the one with the largest sample size. Further multi-center validations and subsequent prospective studies are necessary to assess the effectiveness of this model in real-world scenarios.

Supplementary Material

Supporting Data

Acknowledgements

Not applicable.

Funding

The present study was supported by the National Key Technology Research and Development Program of China (grant nos. 2022YFC2704400 and 2022YFC2704401), the Research and Development Fund of Peking University People's Hospital (grant no. RDJP2023-19), the National Natural Science Foundation of China (grant nos. 82103419, 82230050 and 81874108) and the Natural Science Foundation of Beijing Municipality (grant no. 7234394).

Availability of data and materials

The data generated in the present study may be requested from the corresponding author.

Authors' contributions

RQW, JYW, XCL and JLW contributed to the study conception and design. AXZ, YMW and XCL performed material preparation, data collection and analysis. RQW and JYW wrote the first draft of the manuscript. AXZ, JYW, XCL and JLW provided comments on previous versions of the manuscript. XCL and JLW confirm the authenticity of all the raw data. All authors read and approved the final version of the manuscript.

Ethics approval and consent to participate

The present study was approved (approval no. 2022PHB379) by the Ethics Committee Board of Peking University People's Hospital, in accordance with the principles of the Declaration of Helsinki. Informed consent was obtained from all patients.

Patient consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

References

1	Makker V, MacKay H, Ray-Coquard I, Levine DA, Westin SN, Aoki D and Oaknin A: Endometrial cancer. Nat Rev Dis Primers. 7:88. 2021.
2	Medina HN, Penedo FJ, Joachim C, Deloumeaux J, Koru-Sengul T, Macni J, Bhakkan B, Peruvien J, Schlumbrecht MP and Pinheiro PS: Endometrial cancer risk and trends among distinct African descent populations. Cancer. 129:2717–2726. 2023.
3	Piechocki M, Koziołek W, Sroka D, Matrejek A, Miziołek P, Saiuk N, Sledzik M, Jaworska A, Bereza K, Pluta E and Banas T: Trends in incidence and mortality of gynecological and breast cancers in Poland (1980–2018). Clin Epidemiol. 14:95–114. 2022.
4	Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A and Bray F: Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 71:209–249. 2021.
5	Miller KD, Siegel RL, Lin CC, Mariotto AB, Kramer JL, Rowland JH, Stein KD, Alteri R and Jemal A: Cancer treatment and survivorship statistics, 2016. CA Cancer J Clin. 66:271–289. 2016.
6	Mitric C and Bernardini MQ: Endometrial cancer: Transitioning from histology to genomics. Curr Oncol. 29:741–757. 2022.
7	Cancer Genome Atlas Research Network, Kandoth C, Schultz N, Cherniack AD, Akbani R, Liu Y, Shen H, Robertson AG, Pashtan I, Shen R, et al: Integrated genomic characterization of endometrial carcinoma. Nature. 497:67–73. 2013.
8	Bruno V, Betti M, D'Ambrosio L, Massacci A, Chiofalo B, Pietropolli A, Piaggio G, Ciliberto G, Nisticò P, Pallocca M, et al: Machine learning endometrial cancer risk prediction model: Integrating guidelines of European society for medical oncology with the tumor immune framework. Int J Gynecol Cancer. 33:1708–1714. 2023.
9	Perrone E, De Felice F, Capasso I, Distefano E, Lorusso D, Nero C, Arciuolo D, Zannoni GF, Scambia G and Fanfani F: The immunohistochemical molecular risk classification in endometrial cancer: A pragmatic and high-reproducibility method. Gynecol Oncol. 165:585–593. 2022.
10	Rossi EC, Kowalski LD, Scalici J, Cantrell L, Schuler K, Hanna RK, Method M, Ade M, Ivanova A and Boggess F: A comparison of sentinel lymph node biopsy to lymphadenectomy for endometrial cancer staging (FIRES trial): A multicentre, prospective, cohort study. Lancet Oncol. 18:384–392. 2017.
11	Chen W, Zheng R, Baade PD, Zhang S, Zeng H, Bray F, Jemal A, Yu XQ and He J: Cancer statistics in China, 2015. CA Cancer J Clin. 66:115–132. 2016.
12	Njoku K, Barr CE and Crosbie EJ: Current and emerging prognostic biomarkers in endometrial cancer. Front Oncol. 12:890908. 2022.
13	Coll-de la Rubia E, Martinez-Garcia E, Dittmar G, Gil-Moreno A, Cabrera S and Colas E: Prognostic biomarkers in endometrial cancer: A systematic review and meta-analysis. J Clin Med. 9:1900. 2020.
14	Vrede SW, van Weelden WJ, Visser NCM, Bulten J, van der Putten LJM, van de Vijver K, Santacana M, Colas E, Gil-Moreno A, Moiola CP, et al: Immunohistochemical biomarkers are prognostic relevant in addition to the ESMO-ESGO-ESTRO risk classification in endometrial cancer. Gynecol Oncol. 161:787–794. 2021.
15	Talhouk A, McConechy MK, Leung S, Yang W, Lum A, Senz J, Boyd N, Pike J, Anglesio M, Kwon JS, et al: Confirmation of ProMisE: A simple, genomics-based clinical classifier for endometrial cancer. Cancer. 123:802–813. 2017.
16	Singh N, Piskorz AM, Bosse T, Jimenez-Linan M, Rous B, Brenton JD, Gilks CB and Köbel M: p53 immunohistochemistry is an accurate surrogate for TP53 mutational analysis in endometrial carcinoma biopsies. J Pathol. 250:336–345. 2020.
17	Thiel KW, Devor EJ, Filiaci VL, Mutch D, Moxley K, Secord AA, Tewari KS, McDonald ME, Mathews C, Cosgrove C, et al: TP53 sequencing and p53 immunohistochemistry predict outcomes when bevacizumab is added to frontline chemotherapy in endometrial cancer: An NRG Oncology/Gynecologic oncology group study. J Clin Oncol. 40:3289–3300. 2022.
18	Perrone E, Capasso I, De Felice F, Giannarelli D, Dinoi G, Petrecca A, Palmieri L, Foresta A, Nero C, Arciuolo D, et al: Back to the future: The impact of oestrogen receptor profile in the era of molecular endometrial cancer classification. Eur J Cancer. 186:98–112. 2023.
19	Guan J, Xie L, Luo X, Yang B, Zhang H, Zhu Q and Chen X: The prognostic significance of estrogen and progesterone receptors in grade I and II endometrioid endometrial adenocarcinoma: Hormone receptors in risk stratification. J Gynecol Oncol. 30:e13. 2019.
20	Huvila J, Talve L, Carpén O, Edqvist PH, Pontén F, Grénman S and Auranen A: Progesterone receptor negativity is an independent risk factor for relapse in patients with early stage endometrioid endometrial adenocarcinoma. Gynecol Oncol. 130:463–469. 2013.
21	Jia M, Pi J, Zou J, Feng M, Chen H, Lin C, Yang S and Xiao X: The potential value of ki-67 in prognostic classification in early low-risk endometrial cancer. Cancer Control. 30:10732748231206929. 2023.
22	Wang Z, Zhang S, Ma Y, Li W, Tian J and Liu T: A nomogram prediction model for lymph node metastasis in endometrial cancer patients. BMC Cancer. 21:748. 2021.
23	Feng X, Li XC, Yang X, Cheng Y, Dong YY, Wang JY, Zhou JY and Wang JL: Metabolic syndrome score as an indicator in a predictive nomogram for lymph node metastasis in endometrial cancer. BMC Cancer. 23:622. 2023.
24	Yang XL, Huang H, Kou LN, Lai H, Chen XP and Wu DJ: Construction and validation of a prognostic model for stage IIIC endometrial cancer patients after surgery. Eur J Surg Oncol. 48:1173–1180. 2022.
25	van der Putten LJM, Visser NCM, van de Vijver K, Santacana M, Bronsert P, Bulten J, Hirschfeld M, Colas E, Gil-Moreno A, Garcia A, et al: Added value of estrogen receptor, progesterone receptor, and L1 cell adhesion molecule expression to histology-based endometrial carcinoma recurrence prediction models: An ENITEC collaboration study. Int J Gynecol Cancer. 28:514–523. 2018.
26	Liu L, Yuan S, Yao S, Cao W and Wang L: EPPK1 as a prognostic biomarker in type I endometrial cancer and its correlation with immune infiltration. Int J Gen Med. 17:1677–1694. 2024.
27	Chen J, Yang P, Li S and Feng Y: Increased FOXM1 expression was associated with the prognosis and the recruitment of neutrophils in endometrial cancer. J Immunol Res. 2023:5437526. 2023.
28	Ma H, Feng PH, Yu SN, Lu ZH, Yu Q and Chen J: Identification and validation of TNFRSF4 as a high-profile biomarker for prognosis and immunomodulation in endometrial carcinoma. BMC Cancer. 22:543. 2022.
29	Piedimonte S, Feigenberg T, Drysdale E, Kwon J, Gotlieb WH, Cormier B, Plante M, Lau S, Helpman L, Renaud MC, et al: Predicting recurrence and recurrence-free survival in high-grade endometrial cancer using machine learning. J Surg Oncol. 126:1096–1103. 2022.
30	Feng Y, Wang Z, Xiao M, Li J, Su Y, Delvoux B, Zhang Z, Dekker A, Xanthoulea S, Zhang Z, et al: An applicable machine learning model based on preoperative examinations predicts histology, stage, and grade for endometrial cancer. Front Oncol. 12:904597. 2022.

	Training cohort	Validation cohort

Variables	Mean ± SD	Mean ± SD	P-value
Age at diagnosis	56.49±9.33	55.85±9.59	0.457
Body mass index (kg/m²)	26.35±4.59	26.18±4.12	0.780
ER percentage	0.80±0.29	0.68±0.34	0.183
PR percentage	0.76±0.34	0.70±0.36	0.716
P53 percentage	0.37±0.45	0.35±0.45	0.838
Ki67 percentage	0.35±0.24	0.38±0.21	0.394
Overall survival time (days)	2106.24±1350.77	2185.26±1413.85	0.629

Variables	N (%)	N (%)	P-value
Myometrial infiltration			0.482
<50%	426 (76.62)	220 (79.14)
≥50%	130 (23.38)	58 (20.86)
Cervical invasion			0.713
Negative	490 (88.13)	256 (92.09)
Positive	66 (11.87)	22 (7.91)
FIGO stage			0.859
I	430 (77.34)	237 (85.25)
II	30 (5.40)	12 (4.32)
III	82 (14.75)	22 (7.91)
IV	14 (2.52)	7 (2.52)
Grade			0.299
G1	205 (36.87)	106 (38.13)
G2	239 (42.99)	126 (45.32)
G3	112 (20.14)	46 (16.55)
Menopause status			0.886
Premenopausal	192 (34.53)	101 (36.33)
Postmenopausal	364 (65.47)	177 (63.67)
Diabetes mellitus			0.791
Without	423 (76.08)	213 (76.62)
With	133 (23.92)	65 (23.38)
Hypertension			0.224
Without	319 (57.37)	170 (61.15)
With	237 (42.63)	108 (38.85)
Number of ER +			0.395
0	29 (5.06)	22 (8.44)
1	352 (63.29)	200 (72.15)
2	112 (20.25)	28 (9.70)
3	63 (11.39)	28 (9.70)
Number of PR +			0.718
0	39 (7.17)	30 (12.66)
1	385 (69.20)	164 (69.20)
2	66 (11.81)	9 (3.80)
3	66 (11.81)	34 (14.35)
Number of P53			0.201
0	217 (39.24)	123 (44.30)
1	317 (56.97)	143 (51.48)
2	15 (2.53)	5 (1.69)
3	7 (1.27)	7 (2.53)
Survival status			0.528
Alive	498 (89.57)	253 (91.01)
Death	58 (10.43)	25 (8.99)
Ascites cytology			0.872
Negative	500 (92.25)	259 (95.22)
Positive	42 (7.75)	13 (4.78)
Histology			0.946
Endometrioid endometrial adenocarcinoma	508 (91.37)	252 (90.65)
Other types	48 (8.63)	26 (9.35)
Lymph node metastasis			0.163
Negative	502 (90.29)	260 (93.53)
Positive	54 (9.71)	18 (6.47)
Lymph-vascular space invasion			0.844
Negative	460 (82.73)	229 (82.37)
Positive	96 (17.27)	49 (17.63)
