Retrospective Study
Copyright ©The Author(s) 2024. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Clin Cases. May 26, 2024; 12(15): 2506-2521
Published online May 26, 2024. doi: 10.12998/wjcc.v12.i15.2506
Machine learning-based comparison of factors influencing estimated glomerular filtration rate in Chinese women with or without non-alcoholic fatty liver
I-Chien Chen, Lin-Ju Chou, Shih-Chen Huang, Ta-Wei Chu, Shang-Sen Lee
I-Chien Chen, Lin-Ju Chou, Shih-Chen Huang, Department of Nursing, Kaohsiung Armed Forces General Hospital, Kaohsiung 802, Taiwan
Ta-Wei Chu, Department of Obstetrics and Gynecology, Tri-Service General Hospital, National Defense Medical Center, Taipei 114, Taiwan
Ta-Wei Chu, Chief Executive Officer's Office, MJ Health Research Foundation, Taipei 114, Taiwan
Shang-Sen Lee, Department of Urology, Taichung Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taichung 427, Taiwan
Shang-Sen Lee, School of Medicine, Tzu Chi University, Hualian 970, Taiwan
Shang-Sen Lee, Department of Urology, Tri-Service General Hospital, National Defense Medical Center, Taipei 114, Taiwan
Author contributions: Chen IC participated in study design and oversight; Chou LJ participated in study design and data collection; Huang SC participated in data collection and data analysis; Lee SS and Chu TW both participated in data analysis and drafted the manuscript. All authors read and approved the final manuscript.
Supported by the Kaohsiung Armed Forces General Hospital.
Institutional review board statement: Ethics approval and participant consent: The research plan was reviewed and approved by the Institutional Review Board of Kaohsiung Armed Forces General Hospital on July 1, 2023 prior to the start of the study.
Informed consent statement: All the authors consent to the publication.
Conflict-of-interest statement: All authors have no conflicts of interest.
Data sharing statement: Availability of data and materials: Data available on request due to privacy/ethical restrictions.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Shang-Sen Lee, PhD, Chief Physician, Department of Urology, Taichung Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, No. 88 Section 1, Fengxing Road, Tanzi Dist., Taichung 427, Taiwan. j520037@yahoo.com.tw
Received: January 2, 2024
Revised: February 13, 2024
Accepted: April 9, 2024
Published online: May 26, 2024
Processing time: 133 Days and 7 Hours
Abstract
BACKGROUND

The prevalence of non-alcoholic fatty liver disease (NAFLD) has increased recently. Subjects with NAFLD are known to have a higher risk of renal function impairment. Many past studies used traditional multiple linear regression (MLR) to identify risk factors for decreased estimated glomerular filtration rate (eGFR). However, medical research is increasingly relying on emerging machine learning (Mach-L) methods. The present study enrolled healthy women to identify factors affecting eGFR in subjects with and without NAFLD (NAFLD+, NAFLD-) and to rank their importance.

AIM

To use three different Mach-L methods to identify key impact factors for eGFR in healthy women with and without NAFLD.

METHODS

A total of 65535 healthy female participants were enrolled from the Taiwan MJ cohort. Thirty-two independent variables were included, covering demographic, biochemistry and lifestyle parameters, while eGFR was used as the dependent variable. Aside from MLR, three Mach-L methods were applied: stochastic gradient boosting, eXtreme gradient boosting and elastic net. Estimation errors were used to define method accuracy, with a smaller error indicating better model performance.
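
The error-based comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not name the error metrics used, so root mean square error, mean absolute error and mean absolute percentage error are assumptions, and the eGFR values and the model name `hypothetical_ml_model` are made up for demonstration only.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and estimated values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between observed and estimated values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error (observed values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rank_models(actual, predictions_by_model, metric=rmse):
    """Return (model name, error) pairs sorted from smallest to largest error."""
    scores = {
        name: metric(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Illustrative numbers only; these are NOT results from the study.
actual_egfr = [100.0, 90.0, 80.0, 110.0]
predictions = {
    "MLR": [90.0, 100.0, 90.0, 100.0],
    "hypothetical_ml_model": [98.0, 92.0, 81.0, 108.0],
}

ranking = rank_models(actual_egfr, predictions)
best_model, best_error = ranking[0]
```

Under this convention, the model listed first in `ranking` has the smallest error and would be considered the better performer on that metric.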

RESULTS

Income, albumin, eGFR, high-density lipoprotein cholesterol, phosphorus, forced expiratory volume in one second (FEV1), and sleep time were all lower in the NAFLD+ group, while all other factors except smoking area were significantly higher. Mach-L had lower estimation errors, thus outperforming MLR. In Model 1, age, uric acid (UA), FEV1, plasma calcium level (Ca), plasma albumin level (Alb) and T-bilirubin were the most important factors in the NAFLD+ group, as opposed to age, UA, FEV1, Alb, lactic dehydrogenase (LDH) and Ca for the NAFLD- group. Given that the importance percentage of age was much higher than that of the second most important factor, we built Model 2 by removing age.
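
The step from Model 1 to Model 2 can be illustrated with a short sketch: convert raw importance scores to percentages, rank them, and drop a factor whose share dwarfs the runner-up. Everything here is an assumption for illustration. The raw scores are invented, and the `ratio_threshold` rule is hypothetical, since the abstract states only that the importance of age was "much higher" without giving a numeric criterion.

```python
def importance_percentages(raw_scores):
    """Normalize raw importance scores so that they sum to 100."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {name: 100.0 * score / total for name, score in raw_scores.items()}


def rank_factors(raw_scores):
    """Return (factor, percentage) pairs, most important first."""
    percentages = importance_percentages(raw_scores)
    return sorted(percentages.items(), key=lambda item: item[1], reverse=True)


def dominant_factor(ranked, ratio_threshold=2.0):
    """Return the top factor if its share exceeds ratio_threshold times the
    runner-up's share, otherwise None. The threshold is a hypothetical rule."""
    if len(ranked) < 2:
        return None
    (top_name, top_pct), (_, second_pct) = ranked[0], ranked[1]
    return top_name if top_pct > ratio_threshold * second_pct else None


# Invented scores for demonstration; NOT values reported by the study.
raw_importance = {"age": 60.0, "UA": 15.0, "FEV1": 10.0, "Alb": 10.0, "Ca": 5.0}

ranked = rank_factors(raw_importance)
dropped = dominant_factor(ranked)

# Inputs for a second model that excludes the dominant factor.
model2_inputs = [name for name in raw_importance if name != dropped]
```

In practice the models would be refitted on the reduced variable set before the remaining factors are ranked again; the sketch only shows the selection logic.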

CONCLUSION

eGFR was lower in the NAFLD+ group than in the NAFLD- group, with age being the most important impact factor in both groups of healthy Chinese women, followed by LDH, UA, FEV1 and Alb. However, for the NAFLD- group, thyroid-stimulating hormone (TSH) and systolic blood pressure (SBP) were the 5th and 6th most important factors, as opposed to Ca and body fat (BF) in the NAFLD+ group.

Keywords: Non-alcoholic fatty liver; Estimated glomerular filtration rate; Machine learning; Chinese women

Core Tip: We examined factors affecting the estimated glomerular filtration rate in healthy women with and without non-alcoholic fatty liver disease (NAFLD) using multiple linear regression and machine learning methods. The machine learning methods performed better and showed that age was the most important determining factor in both groups, followed by lactic dehydrogenase, uric acid, forced expiratory volume in one second, and albumin. However, for the NAFLD- group, the 5th and 6th most important impact factors were thyroid-stimulating hormone and systolic blood pressure, as compared to plasma calcium and body fat for the NAFLD+ group.