Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Duan F, Zhang S, Yan Y, Cai Z. An Oversampling Method of Unbalanced Data for Mechanical Fault Diagnosis Based on MeanRadius-SMOTE. Sensors (Basel) 2022;22:5166. [PMID: 35890845 DOI: 10.3390/s22145166] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/13/2022] [Revised: 06/26/2022] [Accepted: 07/08/2022] [Indexed: 11/28/2022]

Cited by Other Article(s)

Zhang Y, Liu H, Huang Q, Qu W, Shi Y, Zhang T, Li J, Chen J, Shi Y, Deng R, Chen Y, Zhang Z. Predictive value of machine learning for in-hospital mortality risk in acute myocardial infarction: A systematic review and meta-analysis. Int J Med Inform 2025;198:105875. [PMID: 40073650 DOI: 10.1016/j.ijmedinf.2025.105875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Revised: 02/25/2025] [Accepted: 03/07/2025] [Indexed: 03/14/2025]

Buscarini L, Romano P, Cocco ES, Damiani C, Pournajaf S, Franceschini M, Infarinato F. Enhancing patient rehabilitation outcomes: artificial intelligence-driven predictive modeling for home discharge in neurological and orthopedic conditions. J Neuroeng Rehabil 2025;22:117. [PMID: 40420280 PMCID: PMC12105185 DOI: 10.1186/s12984-025-01654-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 05/15/2025] [Indexed: 05/28/2025]

Abstract

In recent years, the fusion of the medical and computer science domains has gained significant traction in the scientific research landscape. Progress in both fields has enabled the generation of a vast amount of data used for making predictions and identifying interesting clusters and pathways. The application of machine learning (ML) models in the medical domain is one of the most compelling and challenging topics to explore, bridging the gap between artificial intelligence (AI) and healthcare. The combination of AI and medical information offers the possibility to create tools that can benefit both healthcare providers and physicians, enabling the enhancement of rehabilitation therapy and patient care. In the rehabilitation context, this work provides an alternative perspective: prediction of patients' home discharge upon completing the rehabilitation protocol. Demographic and clinical data were collected on 7282 inpatients from electronic medical records; each record was categorized as Neurological Patients (NP, N = 3222) or Orthopedic Patients (OP, N = 4060). To identify the most suitable machine learning model, an extensive data preprocessing phase was conducted. This process involved variable recoding, scaling, and the evaluation of different dataset balancing methods to optimize model performance. Following a thorough review and comparison of algorithms commonly employed in the clinical-rehabilitative field, the Random Over Sampling (ROS) technique, in combination with the Random Forest (RF) machine learning model, was selected. Subsequently, a comprehensive hyperparameter tuning phase was performed using a grid search approach. The optimized model achieved an average accuracy of 98% for OP and 96% for NP, based on 10-fold cross-validation applied to the balanced training set (unrealistic scenario). When tested on the unbalanced dataset (real-world condition), the RF model maintained strong generalization performance, achieving 90% accuracy for OP and 83% for NP. This work highlights the increasing importance of AI in medicine, especially in the realm of personalized rehabilitation. The use of such approaches could signify a transformative shift in healthcare. The integration of machine learning not only enhances the precision of treatment but also opens new possibilities for patient-centered care, improving outcomes and quality of care for individuals undergoing rehabilitation.
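As a minimal sketch of random oversampling (not the authors' pipeline), the following pure-Python function duplicates randomly chosen minority-class records until every class matches the largest one. The function name, toy labels, and seed are illustrative assumptions. A common precaution is to oversample only the training split, so that held-out data keep the real-world class ratio.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy training split: 8 "home" discharges vs. 2 "other".
train_x = [[i] for i in range(10)]
train_y = ["home"] * 8 + ["other"] * 2

bal_x, bal_y = random_oversample(train_x, train_y)
print(Counter(bal_y))  # both classes now have 8 samples
```

Because the added records are exact copies, a model validated on the oversampled set can see duplicates of its own training records, which is one reason accuracy on balanced data may look better than accuracy on the original distribution.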

Moradi R, Kashanian M, Yarigholi F, Pazouki A, Sheikhtaheri A. Predicting pregnancy at the first year following metabolic-bariatric surgery: development and validation of machine learning models. Surg Endosc 2025;39:2656-2667. [PMID: 40064691 DOI: 10.1007/s00464-025-11640-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Accepted: 02/21/2025] [Indexed: 03/26/2025]

Wang C, Jia P, Tian X, Tang X, Hu X, Li H. Fault Diagnosis of Semi-Supervised Electromechanical Transmission Systems Under Imbalanced Unlabeled Sample Class Information Screening. Entropy (Basel) 2025;27:175. [PMID: 40003172 PMCID: PMC11854703 DOI: 10.3390/e27020175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2024] [Revised: 01/20/2025] [Accepted: 01/23/2025] [Indexed: 02/27/2025]

Abstract

In the health monitoring of electromechanical transmission systems, the collected state data typically consist of only a minimal amount of labeled data, with a vast majority remaining unlabeled. Consequently, deep learning-based diagnostic models encounter the challenge of scarcity in labeled data and abundance in unlabeled data. Traditional semi-supervised deep learning methods based on pseudo-label self-training, while alleviating the issue of labeled data scarcity to some extent, neglect the reliability of pseudo-label information, the accuracy of feature extraction from unlabeled data, and the imbalance in sample selection. To address these issues, this paper proposes a novel semi-supervised fault diagnosis method under imbalanced unlabeled sample class information screening. Firstly, an information screening mechanism for unlabeled data based on active learning is established. This mechanism discriminates based on the variability of intrinsic feature information in fault samples, accurately screening out unlabeled samples located near decision boundaries that are difficult to separate clearly. Then, combining the maximum membership degree of these unlabeled data in the classification space of the supervised model and interacting with the active learning expert system, label information is assigned to the screened unlabeled data. Secondly, a cost-sensitive function driven by data imbalance is constructed to address the class imbalance problem in unlabeled sample screening, adaptively adjusting the weights of different class samples during model training to guide the training of the supervised model. Ultimately, through dynamic optimization of the supervised model and the feature extraction capability of unlabeled samples, the recognition ability of the diagnostic model for unlabeled samples is significantly enhanced. 
Validation on two datasets, covering a total of 12 experimental scenarios, shows that when only a small amount of labeled data is available, the proposed method improves diagnostic accuracy by more than 10% compared with existing typical methods, validating its effectiveness in practical applications.
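To illustrate the general idea of weighting classes by their rarity, here is a minimal sketch using the common inverse-frequency heuristic, weight = n_samples / (n_classes * class_count). This is an assumption chosen for illustration, not the cost-sensitive function proposed in the paper; the function names and toy labels are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each class a weight inversely proportional to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Misclassification rate in which each sample counts with its class weight."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total


# Hypothetical imbalanced label set: 90 normal samples, 10 fault samples.
labels = ["normal"] * 90 + ["fault"] * 10
weights = inverse_frequency_weights(labels)

# A trivial classifier that always predicts the majority class.
always_normal = ["normal"] * len(labels)
print(weights)
print(weighted_error(labels, always_normal, weights))
```

With these weights, a classifier that ignores the fault class is penalized as heavily as one that is wrong half the time, even though its unweighted error is only 10%.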

Chahal A, Gulia P, Gill NS, Yahya M, Haq MA, Aleisa M, Alenizi A, Khan AA, Shukla PK. Predictive analytics technique based on hybrid sampling to manage unbalanced data in smart cities. Heliyon 2024;10:e39275. [PMID: 39759342 PMCID: PMC11697540 DOI: 10.1016/j.heliyon.2024.e39275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 10/09/2024] [Accepted: 10/10/2024] [Indexed: 01/07/2025]

Ishfaq M, Shah SZA, Ahmad I, Rahman Z. Multinomial classification of NLRP3 inhibitory compounds based on large scale machine learning approaches. Mol Divers 2024;28:1849-1868. [PMID: 37418166 DOI: 10.1007/s11030-023-10690-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023]

Yasin P, Yimit Y, Cai X, Aimaiti A, Sheng W, Mamat M, Nijiati M. Machine learning-enabled prediction of prolonged length of stay in hospital after surgery for tuberculosis spondylitis patients with unbalanced data: a novel approach using explainable artificial intelligence (XAI). Eur J Med Res 2024;29:383. [PMID: 39054495 PMCID: PMC11270948 DOI: 10.1186/s40001-024-01988-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 07/18/2024] [Indexed: 07/27/2024]

Abstract

BACKGROUND

Tuberculosis spondylitis (TS), commonly known as Pott's disease, is a severe type of skeletal tuberculosis that typically requires surgical treatment. However, this treatment option has led to an increase in healthcare costs due to prolonged postoperative length of stay (PLOS). Therefore, identifying risk factors associated with extended PLOS is necessary. In this research, we aimed to develop an interpretable machine learning model that could predict extended PLOS and provide valuable insights for treatment. A web-based application was also implemented.

METHODS

We obtained patient data from the spine surgery department at our hospital. Extended postoperative length of stay (PLOS) refers to a hospitalization duration equal to or exceeding the 75th percentile following spine surgery. To identify relevant variables, we employed several approaches, such as the least absolute shrinkage and selection operator (LASSO), recursive feature elimination (RFE) based on support vector machine classification (SVC), correlation analysis, and permutation importance. Several models were implemented, and some of them were combined into ensembles using soft voting. Models were constructed using grid search with nested cross-validation. The performance of each algorithm was assessed through various metrics, including the AUC value (area under the receiver operating characteristic curve) and the Brier score. Model interpretation involved methods such as Shapley additive explanations (SHAP), the Gini impurity index, permutation importance, and local interpretable model-agnostic explanations (LIME). Furthermore, to facilitate the practical application of the model, a web-based interface was developed and deployed.

RESULTS

The study included a cohort of 580 patients, and 11 features were selected: CRP, transfusions, infusion volume, blood loss, X-ray bone bridge, X-ray osteophyte, CT-vertebral destruction, CT-paravertebral abscess, MRI-paravertebral abscess, MRI-epidural abscess, and postoperative drainage. Most of the classifiers performed well, with the XGBoost model showing a higher AUC value (0.86) and a lower Brier score (0.126). The XGBoost model was chosen as the optimal model. The results obtained from the calibration and decision curve analysis (DCA) plots demonstrate that XGBoost achieved promising performance. After tenfold cross-validation, the XGBoost model demonstrated a mean AUC of 0.85 ± 0.09. SHAP and LIME were used to display the variables' contributions to the predicted value. The stacked bar plots indicated that infusion volume was the primary contributor, as determined by Gini, permutation importance (PFI), and the LIME algorithm.

CONCLUSIONS

Our methods not only effectively predicted extended PLOS but also identified risk factors that can be utilized for future treatments. The XGBoost model developed in this study is easily accessible through the deployed web application and can aid in clinical research.
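Two quantities from this abstract are easy to make concrete: the 75th-percentile cut-off that defines extended PLOS, and the Brier score (the mean squared difference between predicted probability and observed outcome, where lower is better). The sketch below is illustrative only. The stay durations are hypothetical, and the percentile method (`method="inclusive"`) is an assumption, since the abstract does not say how the percentile was computed.

```python
import statistics


def label_extended_stay(stay_days):
    """Flag stays at or above the 75th percentile as extended (1) or not (0)."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the 75th percentile.
    threshold = statistics.quantiles(stay_days, n=4, method="inclusive")[2]
    flags = [1 if d >= threshold else 0 for d in stay_days]
    return threshold, flags


def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical postoperative stays in days.
stay_days = [5, 7, 8, 9, 10, 12, 14, 21]
threshold, flags = label_extended_stay(stay_days)
print(threshold, flags)

# Hypothetical predicted probabilities for two patients with outcomes 0 and 1.
score = brier_score([0.1, 0.8], [0, 1])
print(score)
```

Note that a percentile-based label makes roughly a quarter of patients positive by construction, so the resulting classification task is imbalanced from the start.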

Hu WJ, Bai G, Wang Y, Hong DM, Jiang JH, Li JX, Hua Y, Wang XY, Chen Y. Predictive modeling for postoperative delirium in elderly patients with abdominal malignancies using synthetic minority oversampling technique. World J Gastrointest Oncol 2024;16:1227-1235. [PMID: 38660665 PMCID: PMC11037067 DOI: 10.4251/wjgo.v16.i4.1227] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/01/2023] [Revised: 01/12/2024] [Accepted: 02/20/2024] [Indexed: 04/10/2024]

Abstract

BACKGROUND

Postoperative delirium, particularly prevalent in elderly patients after abdominal cancer surgery, presents significant challenges in clinical management.

AIM

To develop a synthetic minority oversampling technique (SMOTE)-based model for predicting postoperative delirium in elderly abdominal cancer patients.

METHODS

In this retrospective cohort study, we analyzed data from 611 elderly patients who underwent abdominal malignant tumor surgery at our hospital between September 2020 and October 2022. The incidence of postoperative delirium was recorded for 7 days after surgery. Patients were divided into delirium and non-delirium groups according to whether postoperative delirium occurred. A multivariate logistic regression model was used to identify risk factors and develop a predictive model for postoperative delirium. The SMOTE technique was applied to enhance the model by oversampling the delirium cases. The model's predictive accuracy was then validated.

RESULTS

In our study involving 611 elderly patients with abdominal malignant tumors, multivariate logistic regression analysis identified significant risk factors for postoperative delirium. These included the Charlson comorbidity index, American Society of Anesthesiologists classification, history of cerebrovascular disease, surgical duration, perioperative blood transfusion, and postoperative pain score. The incidence rate of postoperative delirium in our study was 22.91%. The original predictive model (P1) exhibited an area under the receiver operating characteristic curve of 0.862. In comparison, the SMOTE-based logistic early warning model (P2), which utilized the SMOTE oversampling algorithm, showed a slightly lower but comparable area under the curve of 0.856, suggesting no significant difference in performance between the two predictive approaches.

CONCLUSION

This study confirms that the SMOTE-enhanced predictive model for postoperative delirium in elderly abdominal tumor patients shows performance equivalent to that of traditional methods, effectively addressing data imbalance.
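For readers unfamiliar with SMOTE, the core step is interpolation: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. The following is a minimal pure-Python sketch of that basic idea. It is not the study's implementation and not the MeanRadius-SMOTE variant; the function name, `k`, seed, and toy points are all illustrative assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of the base point, excluding the point itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority-class samples.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [3.0, 3.0]]
synthetic = smote(minority, n_synthetic=5, k=2)
for point in synthetic:
    print(point)
```

Unlike random oversampling, the generated points are new feature vectors rather than copies, but they always lie between existing minority samples.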

Wang LZ, Chi JF, Ding YQ, Yao HY, Guo Q, Yang HQ. Transformer fault diagnosis method based on SMOTE and NGO-GBDT. Sci Rep 2024;14:7179. [PMID: 38531936 DOI: 10.1038/s41598-024-57509-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 03/19/2024] [Indexed: 03/28/2024]

Qiu B, Su XH, Qin X, Wang Q. Application of machine learning techniques in real-world research to predict the risk of liver metastasis in rectal cancer. Front Oncol 2022;12:1065468. [PMID: 36605425 PMCID: PMC9807609 DOI: 10.3389/fonc.2022.1065468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/09/2022] [Accepted: 12/05/2022] [Indexed: 12/24/2022]

Abstract

Background

The liver is the most common site of distant metastasis in rectal cancer, and liver metastasis dramatically affects the treatment strategy of patients. This study aimed to develop and validate a clinical prediction model based on machine learning algorithms to predict the risk of liver metastasis in patients with rectal cancer.

Methods

We integrated two rectal cancer cohorts from Surveillance, Epidemiology, and End Results (SEER) and Chinese multicenter hospitals from 2010 to 2017. We also built and validated liver metastasis prediction models for rectal cancer using six machine learning algorithms, including random forest (RF), light gradient boosting (LGBM), extreme gradient boosting (XGB), multilayer perceptron (MLP), logistic regression (LR), and K-nearest neighbor (KNN). The models were evaluated by combining several metrics, such as the area under the curve (AUC), accuracy score, sensitivity, specificity and F1 score. Finally, we created a web-based calculator using the best model.

Results

The study cohort consisted of 19,958 patients from the SEER database and 924 patients from two hospitals in China. The AUC values of the six prediction models ranged from 0.70 to 0.95. The XGB model showed the best predictive power, with the following metrics assessed in the internal test set: AUC (0.918), accuracy (0.884), sensitivity (0.721), and specificity (0.787). The XGB model was assessed in the external test set with the following metrics: AUC (0.926), accuracy (0.919), sensitivity (0.740), and specificity (0.765). The XGB algorithm also showed a good fit on the calibration and decision curves for both the internal test set and the external validation set. Finally, we constructed an online web calculator using the XGB model to help generalize the model and to better assist physicians in their decision-making.

Conclusion

We successfully developed an XGB-based machine learning model to predict liver metastasis from rectal cancer, which was also validated with a real-world dataset. Finally, we developed a web-based predictor to better guide clinical diagnosis and treatment strategies.
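The threshold-based metrics named in this abstract (accuracy, sensitivity, specificity, F1 score) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for an imbalanced test set of 1000 patients,
# 50 of whom have the outcome of interest.
metrics = classification_metrics(tp=40, fp=50, tn=900, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data, accuracy is dominated by the majority class, which is why sensitivity, specificity, and F1 are usually reported alongside it.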

Li Y, Li B, Ji J, Kalhori H. Advanced Fault Diagnosis and Health Monitoring Techniques for Complex Engineering Systems. Sensors (Basel) 2022;22:10002. [PMID: 36560370 PMCID: PMC9783385 DOI: 10.3390/s222410002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 11/11/2022] [Indexed: 06/17/2023]