Copyright
©The Author(s) 2025.
World J Transplant. Sep 18, 2025; 15(3): 103536
Published online Sep 18, 2025. doi: 10.5500/wjt.v15.i3.103536
Table 6 Comparative performance of ChatGPT and GPT-4 in department cases on liver transplantation, detailing agreement levels by task type
Case ID | Question number | Task | Performance, ChatGPT/GPT-4 | Physicians’ course of action/ground truth | Agreement status, ChatGPT/GPT-4 |
1 | 1 | Case presentation/provide a DD for the patient | Provided a DD that included the final diagnosis/provided a DD that included the final diagnosis | Early anastomotic bile leak | A/A |
1 | 2 | Provide the most probable diagnosis | Suggested a biliary complication, including bile leak, as the most probable diagnosis/suggested bile leak as the most probable diagnosis | Early anastomotic bile leak | A/A |
1 | 3 | Suggest a suitable diagnostic test to confirm the diagnosis | Suggested considering abdominal US or CT, and MRCP/suggested considering abdominal US or CT, fluid drain analysis, and MRCP | Abdominal CT and fluid drain analysis were performed | PA/A |
1 | 4 | Suggest a suitable treatment for this patient | Suggested considering percutaneous drainage, ERCP, surgical intervention, and antibiotics if there are signs of infection/suggested considering less invasive treatments, such as percutaneous drainage and ERCP, as first-line options and proceeding with re-exploration if those fail, while covering the patient with antibiotics | Antibiotics were commenced, followed by an ERCP, which did not resolve the bile leak, and the patient was re-explored | A/A |
2 | 5 | Case presentation/calculate CP score, MELD score, and MELD-sodium score | Accurately calculated CP score and MELD score, underestimated MELD-sodium score/accurately calculated the required scores | CP score = 13, MELD score = 34, and MELD-sodium score = 37 | PA/A |
2 | 6 | Patient’s pre-operative assessment findings presented/evaluate patient’s eligibility to proceed with transplantation | Suggested that the operation was likely postponed or deferred until the patient’s condition improved/suggested that, given the findings, the transplant team would have opted to delay the liver transplantation until active issues were adequately addressed | Transplantation did not proceed | A/A |
3 | 7 | Case presentation/provide a DD for the patient | Provided a DD that did not include the final diagnosis/provided a DD that did not include the final diagnosis | PLS | D/D |
3 | 8 | Provide the most probable diagnosis | Suggested acute cellular rejection as the most probable diagnosis/suggested acute hemolytic transfusion reaction | PLS | D/D |
3 | 9 | Suggest treatment options for the patient | Suggested high-dose intravenous corticosteroids, other anti-rejection medications, and plasmapheresis/suggested withholding further transfusion, administering corticosteroids, and monitoring the patient | Patient was treated with high-dose corticosteroids, plasmapheresis, and intravenous immunoglobulin | PA/D |
3 | 10 | Given the patient’s new signs/symptoms at 3 months (recurrent ascites, low-grade fever, etc.), provide a new DD for the patient | Provided a DD that included the final diagnosis/provided a DD that included the final diagnosis | PTLD | A/A |
3 | 11 | Provide the most probable diagnosis | Suggested PTLD as the most probable diagnosis/suggested nephrotic syndrome as the most probable diagnosis | PTLD | A/D |
4 | 12 | Case presentation/suggest the most suitable diagnostic test | Suggested brain imaging/suggested brain imaging, EEG, and tacrolimus level test | A brain CT, EEG, and tacrolimus level test were performed | PA/A |
4 | 13 | Provide a DD for the patient | Provided a DD that included the final diagnosis/provided a DD that included the final diagnosis | PRES | A/A |
4 | 14 | Provide the most probable diagnosis | Suggested PRES as the most probable diagnosis/suggested tacrolimus neurotoxicity as the most probable diagnosis | PRES | A/D |
5 | 15 | Case presentation/provide a DD for the patient | Provided a DD that included the final diagnosis/provided a DD that included the final diagnosis | GVHD | A/A |
5 | 16 | Provide the most probable diagnosis | Suggested CMV infection as the most probable diagnosis/suggested CMV infection as the most probable diagnosis | GVHD | D/D |
5 | 17 | Suggest appropriate diagnostic tests | Suggested CMV testing, biopsy, and imaging studies/suggested CMV testing, imaging studies, and skin biopsy | Peripheral blood flow cytometry, colonoscopy, and skin biopsy were performed | PA/PA |
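The agreement column of Table 6 can be summarized per model with a short tally. The sketch below is illustrative only and is not part of the authors' analysis: the status strings are transcribed from the 17 rows above, the names `STATUSES` and `tally` are hypothetical, and the reading of the codes (A = agreement, PA = partial agreement, D = disagreement) is an assumption, since the table legend is not reproduced here.

```python
from collections import Counter

# Agreement status per question (ChatGPT/GPT-4), transcribed from Table 6.
# Assumed legend: A = agreement, PA = partial agreement, D = disagreement.
STATUSES = [
    "A/A", "A/A", "PA/A", "A/A",        # case 1, questions 1-4
    "PA/A", "A/A",                      # case 2, questions 5-6
    "D/D", "D/D", "PA/D", "A/A", "A/D", # case 3, questions 7-11
    "PA/A", "A/A", "A/D",               # case 4, questions 12-14
    "A/A", "D/D", "PA/PA",              # case 5, questions 15-17
]


def tally(statuses):
    """Count agreement codes separately for each model."""
    chatgpt, gpt4 = Counter(), Counter()
    for status in statuses:
        left, right = status.split("/")  # left = ChatGPT, right = GPT-4
        chatgpt[left] += 1
        gpt4[right] += 1
    return chatgpt, gpt4


chatgpt_counts, gpt4_counts = tally(STATUSES)
for code in ("A", "PA", "D"):
    print(f"{code}: ChatGPT={chatgpt_counts[code]}, GPT-4={gpt4_counts[code]}")
```

Under these assumptions, the tally for the 17 questions is 9 A, 5 PA, and 3 D for ChatGPT, and 10 A, 1 PA, and 6 D for GPT-4.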
- Citation: Christou CD, Sitsiani O, Boutos P, Katsanos G, Papadakis G, Tefas A, Papalois V, Tsoulfas G. Comparison of ChatGPT-3.5 and GPT-4 as potential tools in artificial intelligence-assisted clinical practice in renal and liver transplantation. World J Transplant 2025; 15(3): 103536
- URL: https://www.wjgnet.com/2220-3230/full/v15/i3/103536.htm
- DOI: https://dx.doi.org/10.5500/wjt.v15.i3.103536