Basic Study
Copyright © The Author(s) 2025.
World J Gastroenterol. Jan 21, 2025; 31(3): 101092
Published online Jan 21, 2025. doi: 10.3748/wjg.v31.i3.101092
Table 2 Performance of ChatGPT-3.5, ChatGPT-4.0 and Google Gemini on hepatitis B infection test questions by different subfields, n (%)
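| Test questions by subfields | ChatGPT-3.5, correct | ChatGPT-3.5, incorrect | ChatGPT-4.0, correct | ChatGPT-4.0, incorrect | Google Gemini, correct | Google Gemini, incorrect |
| --- | --- | --- | --- | --- | --- | --- |
| All test questions (n) | 52 | | 52 | | 52 | |
| 1st run | 34 (65.4) | 18 (34.6) | 43 (82.7) | 9 (17.3) | 37 (71.1) | 15 (28.9) |
| 2nd run | 30 (57.7) | 22 (42.3) | 41 (78.9) | 11 (21.1) | 38 (73.1) | 14 (26.9) |
| 3rd run | 34 (65.4) | 18 (34.6) | 42 (80.8) | 10 (19.2) | 39 (75) | 13 (25) |
| Concordance among 3 runs | 41 (78.9) | | 46 (88.4) | | 50 (96.2) | |
| Total accuracy (%) | 62.9 | | 80.8 | | 73.1 | |
| Risk factors (n) | 5 | | 5 | | 5 | |
| 1st run | 5 (100) | 0 (0) | 5 (100) | 0 (0) | 5 (100) | 0 (0) |
| 2nd run | 5 (100) | 0 (0) | 5 (100) | 0 (0) | 5 (100) | 0 (0) |
| 3rd run | 5 (100) | 0 (0) | 5 (100) | 0 (0) | 5 (100) | 0 (0) |
| Concordance among 3 runs | 5 (100) | | 5 (100) | | 5 (100) | |
| Total accuracy (%) | 100 | | 100 | | 100 | |
| Clinical manifestation (n) | 7 | | 7 | | 7 | |
| 1st run | 2 (28.6) | 5 (71.4) | 4 (57.1) | 3 (42.9) | 5 (71.4) | 2 (28.6) |
| 2nd run | 2 (28.6) | 5 (71.4) | 4 (57.1) | 3 (42.9) | 5 (71.4) | 2 (28.6) |
| 3rd run | 3 (42.9) | 4 (57.1) | 4 (57.1) | 3 (42.9) | 5 (71.4) | 2 (28.6) |
| Concordance among 3 runs | 5 (71.4) | | 6 (85.7) | | 7 (100) | |
| Total accuracy (%) | 33.3 | | 57.1 | | 71.4 | |
| Diagnosis (n) | 18 | | 18 | | 18 | |
| 1st run | 9 (50) | 9 (50) | 15 (83.3) | 3 (16.7) | 13 (72.2) | 5 (27.8) |
| 2nd run | 8 (44.4) | 10 (55.6) | 15 (83.3) | 3 (16.7) | 14 (77.8) | 4 (22.2) |
| 3rd run | 11 (61.1) | 7 (38.9) | 15 (83.3) | 3 (16.7) | 15 (83.3) | 3 (16.7) |
| Concordance among 3 runs | 12 (66.7) | | 16 (88.9) | | 16 (88.9) | |
| Total accuracy (%) | 51.9 | | 83.3 | | 77.8 | |
| Treatment (n) | 11 | | 11 | | 11 | |
| 1st run | 11 (100) | 0 (0) | 10 (90.9) | 1 (9.1) | 9 (81.9) | 2 (18.1) |
| 2nd run | 10 (90.9) | 1 (9.1) | 10 (90.9) | 1 (9.1) | 9 (81.9) | 2 (18.1) |
| 3rd run | 10 (90.9) | 1 (9.1) | 11 (100) | 0 (0) | 9 (81.9) | 2 (18.1) |
| Concordance among 3 runs | 10 (90.9) | | 10 (90.9) | | 11 (100) | |
| Total accuracy (%) | 93.9 | | 93.9 | | 81.9 | |
| Prevention (n) | 7 | | 7 | | 7 | |
| 1st run | 4 (57.1) | 3 (42.9) | 6 (85.7) | 1 (14.3) | 3 (42.9) | 4 (57.1) |
| 2nd run | 3 (42.9) | 4 (57.1) | 4 (57.1) | 3 (42.9) | 3 (42.9) | 4 (57.1) |
| 3rd run | 3 (42.9) | 4 (57.1) | 4 (57.1) | 3 (42.9) | 3 (42.9) | 4 (57.1) |
| Concordance among 3 runs | 6 (85.7) | | 5 (71.4) | | 7 (100) | |
| Total accuracy (%) | 47.6 | | 66.7 | | 42.9 | |
| Prognosis (n) | 4 | | 4 | | 4 | |
| 1st run | 3 (75) | 1 (25) | 3 (75) | 1 (25) | 2 (50) | 2 (50) |
| 2nd run | 2 (50) | 2 (50) | 3 (75) | 1 (25) | 2 (50) | 2 (50) |
| 3rd run | 2 (50) | 2 (50) | 3 (75) | 1 (25) | 2 (50) | 2 (50) |
| Concordance among 3 runs | 3 (75) | | 4 (100) | | 4 (100) | |
| Total accuracy (%) | 58.3 | | 75 | | 50 | |

Rows with a single value per model (n, concordance among 3 runs, total accuracy) apply to that model as a whole; the value is shown in the model's first column.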
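
The "Total accuracy (%)" rows in Table 2 appear to be pooled over the three runs, that is, the sum of correct answers across runs divided by three times the number of questions. This is an inference from the reported figures, not a formula stated in the table. A minimal Python sketch under that assumption follows; the function name `total_accuracy` is hypothetical, and the inputs are the ChatGPT-4.0 and Google Gemini counts for all 52 test questions.

```python
def total_accuracy(correct_per_run, n_questions):
    """Pooled accuracy (%) across runs: total correct / total attempts.

    Assumption: the table's "Total accuracy" is pooled over all runs,
    which is inferred from the reported numbers, not stated in the table.
    """
    if n_questions <= 0 or not correct_per_run:
        raise ValueError("need at least one run and a positive question count")
    attempts = n_questions * len(correct_per_run)
    return 100.0 * sum(correct_per_run) / attempts


# Correct-answer counts per run for all 52 test questions (from Table 2).
gpt4_all = total_accuracy([43, 41, 42], 52)
gemini_all = total_accuracy([37, 38, 39], 52)

print(f"ChatGPT-4.0: {gpt4_all:.1f}")     # 126 / 156 -> 80.8
print(f"Google Gemini: {gemini_all:.1f}")  # 114 / 156 -> 73.1
```

Both values match the reported 80.8 and 73.1. Small differences in the last digit elsewhere in the table may reflect rounding in the source.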